Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sum rows based on unique columns awk

Tags:

unique

count

awk

I'm looking for a more elegant way to do this (for more than >100 columns):

awk '{a[$1]+=$4}{b[$1]+=$5}{c[$1]+=$6}{d[$1]+=$7}{e[$1]+=$8}{f[$1]+=$9}{g[$1]+=$10}END{for(i in a) print i,a[i],b[i],c[i],d[i],e[i],f[i],g[i]}'

Here is the input:

 a1 1   1   2   2
 a2 2   5   3   7
 a2 2   3   3   8
 a3 1   4   6   1
 a3 1   7   9   4
 a3 1   2   4   2

and output:

 a1 1 1 2 2
 a2 4 8 6 15
 a3 3 13 19 7

Thanks :)

like image 843
user2904120 Avatar asked Mar 21 '23 14:03

user2904120


2 Answers

I break the one-liner down into lines, to make it easier to read.

awk '{n[$1];for(i=2;i<=NF;i++)a[$1,i]+=$i}
    END{for(x in n){
        printf "%s ", x
        for(y=2;y<=NF;y++)printf "%s%s", a[x,y],(y==NF?ORS:FS)
        }
    }' file

this awk command should work with your 100 columns file.

test with your file:

kent$  cat f
a1 1   1   2   2
a2 2   5   3   7
a2 2   3   3   8
a3 1   4   6   1
a3 1   7   9   4
a3 1   2   4   2

kent$  awk '{n[$1];for(i=2;i<=NF;i++)a[$1,i]+=$i}END{for(x in n){printf "%s ", x;for(y=2;y<=NF;y++)printf "%s%s", a[x,y],(y==NF?ORS:OFS)}}' f
a1 1 1 2 2
a2 4 8 6 15
a3 3 13 19 7
like image 117
Kent Avatar answered Apr 25 '23 03:04

Kent


Using arrays of arrays in gnu awk version 4

awk '{for (i=2;i<=NF;i++) a[$1][i]+=$i}
END{for (i in a) 
      { printf i FS;
        for (j in a[i]) printf a[i][j] FS 
        printf RS}
    }' file     

a1 1 1 2 2
a2 4 8 6 15
a3 3 13 19 7
like image 20
BMW Avatar answered Apr 25 '23 03:04

BMW