I have a .txt file like this:
col1 col2 col3 col4
1 3 4 A
2 4 6 B
3 1 5 D
5 3 7 F
For every column i after column 1, I want to write column 1 and column i to a new file named after the header of column i.
That means that I will have three output files named "col2.reform.txt", "col3.reform.txt" and "col4.reform.txt" respectively.
For example, the output "col2.reform.txt" file will look like this:
col1 col2
1 3
2 4
3 1
5 3
I tried my code like this:
awk '{for (i=1; i <=NF; i++) print $1"\t"$i > ("{awk 'NR==1' $i}"".reform.txt")}' inputfile
And apparently the "{awk 'NR==1' $i}" part does not work, and I got a file named {awk 'NR==1' $i}.reform.txt.
How can I get the file name correctly? Thanks!
PS: how can I delete the file "{awk 'NR==1' $i}.reform.txt" in the terminal?
Edit: The column names above are just examples. I would prefer a command that takes the file name from the column header, as my real file uses different words as headers.
Based on the samples shown, could you please try the following. Written with the shown samples in GNU awk.
awk '
FNR==1{
  for(i=1;i<=NF;i++){
    heading[i]=$i
  }
  next
}
{
  for(i=2;i<=NF;i++){
    close(outFile)
    outFile=heading[i]".reform.txt"
    if(!indVal[i]++){ print heading[1],heading[i] > (outFile) }
    print $1,$i >> (outFile)
  }
}
' Input_file
For the sample input, this creates the output files col2.reform.txt, col3.reform.txt and col4.reform.txt.
The content of col2.reform.txt will be as follows:
cat col2.reform.txt
col1 col2
1 3
2 4
3 1
5 3
Explanation: a detailed, commented version of the above.
awk '  ##Starting awk program from here.
FNR==1{  ##Checking condition: if this is the first line then do the following.
  for(i=1;i<=NF;i++){  ##Traversing through all fields of the current line.
    heading[i]=$i  ##Storing the current field in the heading array at index i.
  }
  next  ##next skips all further statements for this line.
}
{
  for(i=2;i<=NF;i++){  ##Traversing from the 2nd field to the last field of every remaining line.
    close(outFile)  ##Closing the previously used outFile to avoid a too many open files error.
    outFile=heading[i]".reform.txt"  ##Building the output file name from the header of the current column.
    if(!indVal[i]++){ print heading[1],heading[i] > (outFile) }
    ##If i has not been seen in indVal yet, printing the 1st heading and the current heading into outFile.
    print $1,$i >> (outFile)  ##Appending the 1st field and the current field to outFile.
  }
}
' Input_file  ##Mentioning Input_file name here.
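The edit to the question says the real headers are arbitrary words rather than col2, col3 and so on. Here is a minimal, self-contained sketch for checking that case. The header words (id, height, weight, grade) and the scratch file name sample.txt are made up for illustration and do not come from the question. It takes each file name from the header, and uses the header line itself to create the file:

```shell
# Sample input with made-up header words (hypothetical; not from the question).
printf '%s\n' 'id height weight grade' '1 3 4 A' '2 4 6 B' '3 1 5 D' '5 3 7 F' > sample.txt

awk '
FNR == 1 {                                   # header line: remember every column name
  for (i = 1; i <= NF; i++) heading[i] = $i
}
{
  for (i = 2; i <= NF; i++) {
    outFile = heading[i] ".reform.txt"       # file name comes from the header text
    if (FNR == 1) print $1, $i > (outFile)   # header line: create or truncate the file
    else          print $1, $i >> (outFile)  # data lines: append to it
    close(outFile)                           # keep only one output file open at a time
  }
}
' sample.txt

cat height.reform.txt
```

With this input, the final cat should print "id height" followed by the four id/height pairs. Closing the file after every write is slower, but it keeps a single output file open at a time. That is a choice made for this sketch, not a requirement.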
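Here's a similar one. The header line goes through the same loop as the data lines, so each file gets its header row and is named after its column header.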
$ awk 'NR==1 {n=split($0,h)}
{for(i=2;i<=n;i++) print $1,$i > (h[i]".reform.txt")}' file
==> col2.reform.txt <==
col1 col2
1 3
2 4
3 1
5 3
==> col3.reform.txt <==
col1 col3
1 4
2 6
3 5
5 7
==> col4.reform.txt <==
col1 col4
1 A
2 B
3 D
5 F
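On the PS about removing the stray file: a small sketch, assuming the file sits in the current directory. The name used below is the one reported in the question. Because the shell ends the single-quoted awk program at the inner quotes in the attempted command, the name on disk may actually lack those inner quotes, so it is worth listing the file first. The touch line only recreates a stand-in so the snippet runs on its own.

```shell
# Stand-in for the stray file, created here only so the snippet is self-contained.
touch -- "{awk 'NR==1' \$i}.reform.txt"

# List first to see the exact name. The quoted parts of the pattern are literal; the * matches the rest.
ls -- '{awk'*'.reform.txt'

# Remove by exact name: double quotes protect the spaces and single quotes,
# \$ stops the shell from expanding $i, and -- ends option parsing.
rm -- "{awk 'NR==1' \$i}.reform.txt"
```

If ls shows a slightly different name, put that exact name inside the double quotes instead, keeping the backslash before the dollar sign.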