In AWK, is it possible to specify "ranges" of fields?

Tags:

awk

Besides the awk answer by @Jerry, there are other alternatives:

Using cut (assumes tab delimiter by default):

cut -f32-58 foo >bar
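cut only splits on tabs unless told otherwise; the -d option selects a different single-character delimiter. A minimal sketch, assuming a hypothetical comma-separated sample (the wrapper name cut_range is made up for illustration, and the original file foo is not reproduced here):

```shell
# cut_range DELIM LIST: print the fields named in LIST (e.g. 2-4) from stdin.
# Hypothetical helper; it only forwards its arguments to cut.
cut_range() {
    cut -d "$1" -f "$2"
}

printf 'a,b,c,d,e\n' | cut_range , 2-4    # prints b,c,d
```

Note that cut treats every delimiter character literally, so it does not understand quoted CSV fields.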

Using perl:

perl -nle '@a=split;print join "\t", @a[31..57]' foo >bar

Mildly revised version:

BEGIN { s = 32; e = 57; }

      { for (i=s; i<=e; i++) printf("%s%s", $(i), i<e ? OFS : "\n"); }
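The loop above prints empty fields when a record has fewer than e fields. One way to handle short records, sketched here with a hypothetical shell wrapper named field_loop and the range passed in via -v, is to clamp the upper bound to NF:

```shell
# field_loop START END: print fields START..END of each stdin record,
# joined by OFS. The wrapper name and the -v variables are illustrative.
field_loop() {
    awk -v s="$1" -v e="$2" '
    {
        last = (e + 0 < NF) ? e + 0 : NF      # never read past the final field
        for (i = s + 0; i <= last; i++)
            printf("%s%s", $i, (i < last) ? OFS : "")
        printf("\n")                          # always terminate the output line
    }'
}

printf '1 2 3 4 5 6 7 8 9\na b c\n' | field_loop 3 6
# prints:
# 3 4 5 6
# c
```

The "+ 0" forces numeric comparison, so the bounds are not compared as strings.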

You can do it in awk by using RE intervals. For example, to print fields 3-6 of the records in this file:

$ cat file
1 2 3 4 5 6 7 8 9
a b c d e f g h i

would be:

$ gawk 'BEGIN{f="([^ ]+ )"} {print gensub("("f"{2})("f"{4}).*","\\3","")}' file
3 4 5 6
c d e f

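I'm creating an RE segment f to represent each field plus its succeeding field separator (for convenience). I then use it in gensub() to match the first 2 of those (i.e. the first 2 fields), capture the next 4 for later reference as \3, and match whatever comes after them, so that the whole record is replaced by the captured part. The backreference is \3 rather than \2 because f itself contains a capture group: group 1 is the first two fields, group 2 is the f inside it, and group 3 is the four fields we want. For your tab-separated file where you want to print fields 32-57 (i.e. the 26 fields after the first 31) you'd use: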

gawk 'BEGIN{f="([^\t]+\t)"} {print gensub("("f"{31})("f"{26}).*","\\3","")}' file

The above uses GNU awk for its gensub() function. With other awks you'd use sub() or match() and substr().
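For awks without gensub(), a rough equivalent can be built from sub(), match() and substr(). The following is only a sketch under stated assumptions: the input is tab-separated, the wrapper name field_range is hypothetical, and it loops instead of using interval expressions, which some older awks do not support.

```shell
# field_range START END: print tab-separated fields START..END from stdin.
# Hypothetical helper; assumes a single tab between fields.
field_range() {
    awk -v s="$1" -v e="$2" '
    {
        rest = $0
        # Drop the first s-1 fields, one sub() call per field.
        for (i = 1; i < s + 0; i++)
            if (!sub(/^[^\t]*\t/, "", rest)) { rest = ""; break }

        # Measure the next e-s+1 fields with match() and RLENGTH.
        len = 0
        tail = rest
        for (i = s + 0; i <= e + 0 && tail != ""; i++) {
            match(tail, /^[^\t]*\t?/)
            len += RLENGTH
            tail = substr(tail, RLENGTH + 1)
        }

        out = substr(rest, 1, len)
        if (i > e + 0) sub(/\t$/, "", out)   # drop the separator after field e
        print out
    }'
}

printf '1\t2\t3\t4\t5\t6\t7\t8\t9\n' | field_range 3 6    # prints 3, 4, 5, 6 separated by tabs
```

Records with fewer than START fields produce an empty output line.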

EDIT: Here's how to write a function to do the job:

gawk '
function subflds(s,e,   f) {
   f="([^" FS "]+" FS ")"
   return gensub( "(" f "{" s-1 "})(" f "{" e-s+1 "}).*","\\3","")
}
{ print subflds(3,6) }
' file
3 4 5 6
c d e f

Just set FS as appropriate. Note that this only works if your FS is a single character. It will also need a tweak for the default FS if your input lines can start with spaces or have multiple spaces between fields.
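One possible tweak for the default FS, offered as an assumption rather than the author's own fix, is to make awk rebuild the record before the pattern is applied. Assigning any field causes awk to rejoin the fields with OFS, which removes leading blanks and collapses runs of whitespace into a single space. The helper name normalize_ws is hypothetical:

```shell
# normalize_ws: rewrite each stdin record with exactly one space between fields.
normalize_ws() {
    # $1 = $1 forces awk to rebuild $0 from its fields, joined by OFS.
    awk '{ $1 = $1; print }'
}

printf '  1   2 3\t4\n' | normalize_ws    # prints 1 2 3 4
```

The same "$1 = $1" assignment could be placed at the start of the main rule, before the call to subflds().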

I'm late, but this is quick and to the point, so I'll leave it here. In cases like this I normally just remove the fields I don't need with gsub and print. Quick and dirty example: since you know your file is delimited by tabs, you can remove the first 31 fields:

awk '{gsub(/^([^\t]*\t){31}/,"");print}'

An example that removes only 4 fields, to keep it short:

printf "a\tb\tc\td\te\tf\n" | awk '{gsub(/^([^\t]*\t){4}/,"");print}'

Output:

e   f

This is shorter to write, easier to remember, and uses fewer CPU cycles than horrendous loops.

You can use a combination of loops and printf for that in awk:

#!/bin/bash

start_field=32
end_field=58

awk -v start="$start_field" -v end="$end_field" 'BEGIN{OFS="\t"}
{for (i=start; i<=end; i++) {
    printf "%s", $i;        # the comma matters: "%s" $i would concatenate the field into the format string
    if (i < end) {
        printf "%s", OFS;
    } else {
        printf "\n";
    }
}}'

This looks a bit hacky; however, it properly delimits your output based on the specified OFS, and it makes sure to print a newline at the end of each input line in the file.
