awk extract a column and output a file named by the column header

Question

I have a .txt file like this:

col1    col2    col3    col4
1   3   4   A
2   4   6   B
3   1   5   D
5   3   7   F

I want to extract every single column (i) after column 1 and output column1 and column i into a new file named by the header of column i.

That means that I will have three output files named "col2.reform.txt", "col3.reform.txt" and "col4.reform.txt" respectively.

For example, the output "col2.reform.txt" file will look like this:

col1    col2
1   3
2   4
3   1
5   3

I tried my code like this:

awk '{for (i=1; i <=NF; i++) print $1"	"$i > ("{awk 'NR==1' $i}"".reform.txt")}' inputfile

And apparently the "{awk 'NR==1' $i}" part does not work, and I got a file named {awk 'NR==1' $i}.reform.txt.

How can I get the file name correctly? Thanks!

PS: how can I delete the file "{awk 'NR==1' $i}.reform.txt" in the terminal?

Edited: The column names above are just examples. I would prefer a command that reads the actual header of each column, since my real file uses different words as headers.

RavinderSingh13 · Accepted Answer

Based on your shown samples, could you please try the following. It is written in GNU awk against those samples.

awk '
FNR==1{
  for(i=1;i<=NF;i++){
    heading[i]=$i
  }
  next
}
{
  for(i=2;i<=NF;i++){
    close(outFile)
    outFile=heading[i]".reform.txt"
    if(!indVal[i]++){ print heading[1],heading[i] > (outFile) }
    print $1,$i >> (outFile)
  }
}
' Input_file

Output files will be created with names such as col2.reform.txt, col3.reform.txt, col4.reform.txt, and so on.

Sample content of col2.reform.txt:

cat col2.reform.txt
col1 col2
1 3
2 4
3 1
5 3

Explanation: a detailed, commented version of the code above.

awk '                             ##Start of the awk program.
FNR==1{                           ##If this is the first line (the header), do the following.
  for(i=1;i<=NF;i++){             ##Loop through all fields of the header line.
    heading[i]=$i                 ##Store each column name in the heading array, indexed by field number.
  }
  next                            ##Skip the remaining statements for the header line.
}
{
  for(i=2;i<=NF;i++){             ##Loop from the 2nd field to the last field of every remaining line.
    close(outFile)                ##Close the previously used output file to avoid a "too many open files" error.
    outFile=heading[i]".reform.txt"   ##Build the output file name from the header of column i.
    if(!indVal[i]++){ print heading[1],heading[i] > (outFile) }
                                  ##The first time column i is seen, write the header line, truncating any old file.
    print $1,$i >> (outFile)      ##Append the 1st field and the current field to the output file.
  }
}
' Input_file                      ##Name of the input file.
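
If tab-separated output is wanted, as in the question's own attempt ($1"	"$i), a minimal sketch along the same lines sets OFS. The printf line that rebuilds the sample as Input_file and the array name seen are assumptions made for illustration, not part of the answer above. This variant keeps every output file open, which should be fine for a handful of columns.

```shell
# Recreate the sample from the question (tab-separated input is an assumption).
printf 'col1\tcol2\tcol3\tcol4\n1\t3\t4\tA\n2\t4\t6\tB\n3\t1\t5\tD\n5\t3\t7\tF\n' > Input_file

awk '
BEGIN { OFS = "\t" }                        # join printed fields with a tab
FNR == 1 {                                  # header line: remember every column name
  for (i = 1; i <= NF; i++) heading[i] = $i
  next
}
{
  for (i = 2; i <= NF; i++) {
    outFile = heading[i] ".reform.txt"      # file name taken from the header
    if (!seen[i]++) print heading[1], heading[i] > (outFile)
    print $1, $i > (outFile)                # file is already open, so this appends
  }
}
' Input_file
```

Within one awk run, ">" truncates a file only when it is first opened, so later prints to the same name add lines instead of overwriting them.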

karakfa · Answer

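Here's a similar one. The header line goes through the same loop as the data lines, so each output file starts with its column names.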

$ awk 'NR==1 {n=split($0,h)} 
             {for(i=2;i<=n;i++) print $1,$i > (h[i]".reform.txt")}' file

==> col2.reform.txt <==
col1 col2
1 3
2 4
3 1
5 3

==> col3.reform.txt <==
col1 col3
1 4
2 6
3 5
5 7

==> col4.reform.txt <==
col1 col4
1 A
2 B
3 D
5 F
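
The PS about removing the oddly named file is not covered above. A minimal sketch, assuming the stray file is in the current directory and that its name starts with {awk and ends with .reform.txt; the touch line only recreates such a file for demonstration.

```shell
# Recreate a stray file with the reported name (demonstration only).
touch "{awk 'NR==1' \$i}.reform.txt"

# Quote the literal parts and leave only * unquoted, so the shell does not
# expand $i, does not brace-expand {, and is not confused by the inner quotes.
rm ./'{awk'*'.reform.txt'
```

Adding -i to rm asks for confirmation before each deletion, which is a reasonable precaution when a glob is involved.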
