Im trying to learn awk at the moment and I want to do a specific task. My question is similar in scope to one previously posted(Using awk to transpose column to row), but wouldn't quite work for my data. I have been trying to work out why and im sure its quite simple.
I have large data in a tab delimited table with only two fields (example below):
1101\t7778
1101\t7755
1101\t8889
1101\t6789
2300\t1220
4000\t2333
4000\t7555
4000\t9000
4000\t1111
and I want to end up appending the second field onto a row when the field matches. The desired output would be:
1101\t7778\t7755\t8889\t6789
2300\t1220
4000\t2333\t7555\t9000\t1111
If possible, Id like to get an explaination of all the parts within the command so I can understand it in the future. Thanks in advance.
awk ' { list[$1] = list[$1] "\t" $2 }
END { for (i in list) printf "%s%s\n", i, list[i] }' data
The first line adds a tab and the second field to the list
element indexed by $1
. The second line prints out the key and the accumulated list of values.
Sample output:
1101 7778 7755 8889 6789
4000 2333 7555 9000 1111
2300 1220
If you want the first column sorted, you can pipe the output through sort -n
. If you have GNU awk
, you can investigate the built-in sort function too:
/usr/gnu/bin/awk ' { list[$1] = list[$1] "\t" $2 }
END { n = asorti(list, indexes);
for (i = 1; i <= n; i++)
printf "%s%s\n", indexes[i], list[indexes[i]]
}' data
Sorted output:
1101 7778 7755 8889 6789
2300 1220
4000 2333 7555 9000 1111
For abasu's request a pure bash
version:
#!/bin/bash
declare -A hash
while read x y; do
hash[$x]=${hash[$x]}"\t"$y
done <<XXX
1101 7778
1101 7755
1101 8889
1101 6789
2300 1220
4000 2333
4000 7555
4000 9000
4000 1111
XXX
for i in ${!hash[*]}; { echo -e $i${hash[$i]};}
Output:
2300 1220
1101 7778 7755 8889 6789
4000 2333 7555 9000 1111
In the here-is-the-document there is a tab character between the columns, as well as in between the output columns. If the -e
is removed from the last line after echo
the output is:
2300\t1220
1101\t7778\t7755\t8889\t6789
4000\t2333\t7555\t9000\t1111
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With