Awk conditional sum from a CSV file

Tags: csv, awk

I have a CSV file from which I would like to extract some information: for each distinct value in one column, I would like to compute the sum of the corresponding values in another column. I may end up doing it in Python, but I believe there could be a simple solution using awk.

This could be the CSV file:

2    1:2010-1-bla:bla    1.6
2    2:2010-1-bla:bla   1.1
2    2:2010-1-bla:bla    3.4
2    3:2010-1-bla:bla    -1.3
2    3:2010-1-bla:bla    6.0
2    3:2010-1-bla:bla    1.1
2    4:2010-1-bla:bla    -1.0
2    5:2010-1-bla:bla    10.9

I would like to get:

1    1.6
2    4.5
3    5.8
4    -1.0
5    10.9

For now, I can only extract:

a) the keys, that is, the part of the second column before the first colon:

awk -F ' ' '{print $(2)}' MyFile.csv | awk -F ':' '{print $(1)}'

and then get:

1
2
2
3
3
3
4
5

b) and the values equal to, say, 1.1 in the last column with:

awk -F ' ' '{print $(NF)}' MyFile.csv | awk '$1 == 1.1'

and then get:

1.1
1.1

What I cannot do is extract both columns I am interested in at the same time, which might help me in the end. Here is an intermediate output that may make the sums easier to compute (I am not sure):

1    1.6
2    1.1
2    3.4
3    -1.3
3    6.0
3    1.1
4    -1.0
5    10.9
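
The intermediate output above can be produced in a single awk call by splitting on more than one delimiter at once. This is a minimal sketch, assuming the columns are separated by runs of spaces or tabs and that the key ends at the first colon. The variable names `sample` and `pairs` are illustrative and do not come from the question.

```shell
# Sample rows copied from the question; with the real data, pass the
# file to awk instead of piping this variable in.
sample='2    1:2010-1-bla:bla    1.6
2    2:2010-1-bla:bla   1.1
2    2:2010-1-bla:bla    3.4
2    3:2010-1-bla:bla    -1.3
2    3:2010-1-bla:bla    6.0
2    3:2010-1-bla:bla    1.1
2    4:2010-1-bla:bla    -1.0
2    5:2010-1-bla:bla    10.9'

# Treat any run of colons, spaces, or tabs as one separator, so the key
# is field 2 and the value is the last field.
pairs=$(printf '%s\n' "$sample" | awk -F'[: \t]+' '{ print $2, $NF }')
printf '%s\n' "$pairs"
```

On the real file this reduces to awk -F'[: \t]+' '{ print $2, $NF }' MyFile.csv, which replaces the two piped awk calls from a) with one.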

Edit: Thanks to Elenaher, we could say the input is the file above.

asked Oct 14 '10 at 14:10 by Wok


1 Answer

$ awk -F"[: \t]+" '{a[$2]+=$NF}END{for(i in a ) print i,a[i] }' file
4 -1
5 10.9
1 1.6
2 4.5
3 5.8
answered Oct 05 '22 at 04:10 by ghostdog74
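
A note on the one-liner above: the separator [: \t]+ splits each line on runs of colons, spaces, or tabs, so $2 is the key and $NF is the value. a[$2]+=$NF keeps a running sum per key, and the END block prints each key with its total. The order in which for (i in a) visits keys is unspecified, which is why the output starts with 4, and awk prints the total for key 4 as -1 rather than -1.0. The variant below is a sketch, not part of the answer, assuming the sorted, one-decimal layout shown in the question is wanted. The names `sample`, `sums`, `sum` and `key` are illustrative.

```shell
# Sample rows copied from the question; with the real data, pass the
# file to awk instead of piping this variable in.
sample='2    1:2010-1-bla:bla    1.6
2    2:2010-1-bla:bla   1.1
2    2:2010-1-bla:bla    3.4
2    3:2010-1-bla:bla    -1.3
2    3:2010-1-bla:bla    6.0
2    3:2010-1-bla:bla    1.1
2    4:2010-1-bla:bla    -1.0
2    5:2010-1-bla:bla    10.9'

# Same accumulation as the answer, then:
#   printf "%.1f" fixes one decimal place (assumption: matches the
#   desired output in the question),
#   sort -n orders the lines by their leading numeric key.
sums=$(printf '%s\n' "$sample" |
  awk -F'[: \t]+' '
    { sum[$2] += $NF }                      # running total per key
    END { for (key in sum) printf "%s %.1f\n", key, sum[key] }
  ' |
  sort -n)
printf '%s\n' "$sums"
```

Sorting outside awk keeps the script portable across awk implementations, since control over array traversal order is not part of POSIX awk.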