I have a file that has a name in the first column and a count in the second column. It is sorted by name.
dan 3355
dan 667
dan 889
frank 8
frank 99
frank 90
ian 9
I would like to combine all the same names and output the total count for each name:
dan 4911
frank 197
ian 9
I know that I can use uniq for getting a total count of the identical lines, but how can I preserve the counts that I have in my data?
You can make use of awk's associative array:
awk '{arr[$1]+=$2;} END {for (i in arr) print i, arr[i]}' filename
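As a quick check, the one-liner can be run against the sample data from the question. This is only a sketch: the file name `counts.txt` and the function name `sum_by_name` are made up for illustration, and the trailing `sort` is an addition, there because `for (i in arr)` visits the keys in an unspecified order.

```shell
# Hypothetical sample file holding the data from the question.
printf '%s\n' 'dan 3355' 'dan 667' 'dan 889' \
              'frank 8' 'frank 99' 'frank 90' 'ian 9' > counts.txt

# Hypothetical helper: sum column 2 per name in column 1, then sort by name.
sum_by_name() {
    awk '{arr[$1] += $2} END {for (i in arr) print i, arr[i]}' "$1" | sort
}

sum_by_name counts.txt
```

With the sample data this should print `dan 4911`, `frank 197` and `ian 9`, one per line.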
Using awk's associative array does not guarantee that names will appear in the output in the same order as in the input, and because it keeps every distinct name in memory it may be inefficient for large data sets.
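Since the input is already sorted by name, you can use the following instead, which only keeps the current name and its running total:
awk 'NR==1 {oldname=$1; s=$2; next}
oldname==$1 {s+=$2; next}
{print oldname, s; oldname=$1; s=$2}
END {print oldname, s}' filename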
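For reference, here is a sketch of the same run-by-run idea wrapped in a shell function that reads standard input or files. The names `sum_runs`, `prev` and `sum` are made up for illustration, and it assumes identical names are adjacent in the input, as they are in a file sorted by name.

```shell
# Hypothetical helper: sum column 2 for each run of identical names in column 1.
# Assumes identical names are adjacent (e.g. the input is sorted by name).
sum_runs() {
    awk '
        NR > 1 && $1 != prev { print prev, sum; sum = 0 }  # name changed: flush the finished group
        { prev = $1; sum += $2 }                            # accumulate the current line
        END { if (NR > 0) print prev, sum }                 # flush the last group, if any
    ' "$@"
}

printf '%s\n' 'dan 3355' 'dan 667' 'dan 889' \
              'frank 8' 'frank 99' 'frank 90' 'ian 9' | sum_runs
```

On the sample data this should print `dan 4911`, `frank 197` and `ian 9` in input order, without needing a `sort` afterwards. If the same name shows up again later in unsorted input, it is reported as a separate group.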