Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Organize a file with AWK

Well, I have the following file:

Progeny Sire    Dam    Sex  PENAS       P35         P41
13254   11908   11421   M   47.275811   1322.828674 1719.183748
13323   11335   11386   M   43.29896    1225.57111  1634.436447
13562   11864   11895   M   47.884191   1228.568357 1615.427502
13338   11335   11970   M   45.780973   1196.32757  1561.900145

I need to transpose the PENAS, P35 and P41 the columns. These columns will be the new column: age. Visually, I need that file be this way:

Progeny Sire    Dam     Sex AGE     Peso
13254   11908   11421   M   PENAS   47.275811
13254   11908   11421   M   P35     1322.828674
13254   11908   11421   M   P41     1719.183748
13323   11335   11386   M   PENAS   43.29896
13323   11335   11386   M   P35     1225.57111
13323   11335   11386   M   P41     1634.436447
13562   11864   11895   M   PENAS   47.884191
13562   11864   11895   M   P35     1228.568357
13562   11864   11895   M   P41     1615.427502
13338   11335   11970   M   PENAS   45.780973
13338   11335   11970   M   P35     1196.32757
13338   11335   11970   M   P41     1561.900145

I tried this command but it did not work:

awk 'NR==1{h=$1 OFS $2 OFS $3 OFS $4 OFS $5 OFS $6 OFS $7; next}
            {a[$2]=(($1 in a)?(a[$1] OFS $NF):(OFS $5 OFS $6 OFS $7 OFS "AGE"));
             if(!($7 in b)) {h=h OFS $7; b[$7]}}
        END{print h; for(k in a) print k,a[k]}' a.txt | column -t > b

And I'm stuck at that point, any suggestion please? Thanks. Note, my original daset has 1400 rows.

like image 249
Greg Rov Avatar asked Dec 14 '22 16:12

Greg Rov


2 Answers

here is another awk, not dependent on number of columns...

$ awk 'NR==1{n=split($0,h); 
             for(i=1;i<=NF-3;i++) printf "%s", $i OFS; 
             printf "%s\n", "AGE" OFS "Peso"; next} 
            {split($0,p); 
             NF--; 
             for(i=1;i<=3;i++) 
               {$(NF-1)=h[NF-2+i]; 
                $NF=p[NF-2+i]; 
                print}}' file | column -t


Progeny  Sire   Dam    Sex  AGE    Peso
13254    11908  11421  M    PENAS  47.275811
13254    11908  11421  M    P35    1322.828674
13254    11908  11421  M    P41    1719.183748
13323    11335  11386  M    PENAS  43.29896
13323    11335  11386  M    P35    1225.57111
13323    11335  11386  M    P41    1634.436447
13562    11864  11895  M    PENAS  47.884191
13562    11864  11895  M    P35    1228.568357
13562    11864  11895  M    P41    1615.427502
13338    11335  11970  M    PENAS  45.780973
13338    11335  11970  M    P35    1196.32757
13338    11335  11970  M    P41    1561.900145
like image 149
karakfa Avatar answered Jan 19 '23 03:01

karakfa


With GNU awk for gensub():

$ cat tst.awk
BEGIN { numPfx=4 }
{ pfx = gensub("((\\S+\\s+){"numPfx"}).*","\\1",1) }
NR==1 {
    split($0,ages)
    print pfx, "AGE", "Peso"
    next
}
{
    for (i=numPfx+1; i<=NF; i++) {
        print pfx, ages[i], $i
    }
}

$ awk -f tst.awk file | column -t
Progeny  Sire   Dam    Sex  AGE    Peso
13254    11908  11421  M    PENAS  47.275811
13254    11908  11421  M    P35    1322.828674
13254    11908  11421  M    P41    1719.183748
13323    11335  11386  M    PENAS  43.29896
13323    11335  11386  M    P35    1225.57111
13323    11335  11386  M    P41    1634.436447
13562    11864  11895  M    PENAS  47.884191
13562    11864  11895  M    P35    1228.568357
13562    11864  11895  M    P41    1615.427502
13338    11335  11970  M    PENAS  45.780973
13338    11335  11970  M    P35    1196.32757
13338    11335  11970  M    P41    1561.900145

With other awks you'd just replace gensub() with a variable plus sub() and replace \\S with [^[:space:]] and \\s with [[:space:]].

like image 27
Ed Morton Avatar answered Jan 19 '23 02:01

Ed Morton