
Awk: printing undetermined number of columns

Tags: awk

I have a file whose fields are separated by tabs. I am trying to use AWK to print every column except the first, with all the values in a single output column. The format of the file is:

col 1   col 2   ... col n

Each row has at least 2 columns.

Sample

2012029754      901749095
2012028240      901744459       258789
2012024782      901735922
2012026032      901738573       257784
2012027260      901742004
2003062290      901738925       257813  257822
2012026806      901741040
2012024252      901733947       257493
2012024365      901733700
2012030848      901751693       260720  260956  264843  264844

So I want awk to print columns 2 through n of every row, one value per line, without printing blank lines when a row has no data in the later columns, like the following:

901749095
901744459
258789
901735922
901738573
257784
901742004
901738925
257813
257822
901741040
901733947
257493
901733700
901751693
260720
260956
264843
264844

This is my first time using awk, so bear with me. I wrote this on the command line, and it works:

awk '{i=2; 
while ($i ~ /[0-9]+/)
{ 
    printf "%s\n", $i
    i++
}
}' bth.data

This is less a question than a request for approval: is this the right way to do something like this in AWK, or is there a better or shorter way?

Note that the actual input file could be millions of lines long.

Thanks

Hameed asked Aug 08 '12 23:08


2 Answers

Is this what you want as output?

awk '{for(i=2; i<=NF; i++) print $i}' bth.data

gives

901749095
901744459
258789
901735922
901738573
257784
901742004
901738925
257813
257822
901741040
901733947
257493
901733700
901751693
260720
260956
264843
264844

NF is one of several predefined awk variables. It holds the number of fields on the current input line. For instance, it is useful if you always want to print the last field in a line: print $NF. It is also useful if you want to iterate through all or part of the fields on a line, up to the end of the line.
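
Here is a minimal sketch of both uses of NF, assuming tab-separated input arrives on standard input. The function names print_from_second and last_field are made up for illustration and are not part of awk.

```shell
# Hypothetical wrapper: print fields 2..NF of every line, one per line.
print_from_second() {
    # NF is the field count of the current line, so the loop body never
    # runs past the last field and no blank lines are printed.
    awk '{ for (i = 2; i <= NF; i++) print $i }' "$@"
}

# Hypothetical wrapper: print only the last field of every line.
last_field() {
    awk '{ print $NF }' "$@"
}

# Two rows taken from the sample data in the question.
printf '2012029754\t901749095\n2012028240\t901744459\t258789\n' | print_from_second
# prints:
# 901749095
# 901744459
# 258789
```

Because the loop is bounded by NF, it does not depend on what the fields contain, so non-numeric values would be printed as well.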

Levon answered Dec 23 '22 17:12


Seems like awk is the wrong tool. I would do:

cut -f 2- < bth.data | tr -s '\t' '\n'

Note that the -s option makes tr squeeze repeated newlines into one, so empty fields do not produce blank lines, as the original problem requires.
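
To see the squeezing in action, here is a small sketch that feeds the same pipeline a row containing an empty field. The function name columns_to_lines and the test input are invented for illustration.

```shell
# Hypothetical wrapper around the pipeline above; reads standard input.
columns_to_lines() {
    # cut -f 2- keeps fields 2..n (tab is cut's default delimiter);
    # tr turns each tab into a newline, and -s squeezes runs of newlines.
    cut -f 2- | tr -s '\t' '\n'
}

# The second input line has an empty third field (two adjacent tabs).
printf '1\t2\t3\n4\t5\t\t6\n' | columns_to_lines
# prints:
# 2
# 3
# 5
# 6
```

Without -s, the empty field in the second row would show up as a blank line between 5 and 6.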

William Pursell answered Dec 23 '22 17:12