Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Join lines with the same value in the first column

Tags:

bash

awk

I have a tab-delimited file with three columns (excerpt):

AC147602.5_FG004    IPR000146   Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase
AC147602.5_FG004    IPR023079   Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001    IPR002110   Ankyrin repeat
AC148152.3_FG001    IPR026961   PGG domain

and I'd like to get this using bash:

AC147602.5_FG004 IPR000146 Fructose-1,6-bisphosphatase class 1/Sedoheputulose-1,7-bisphosphatase IPR023079 Sedoheptulose-1,7-bisphosphatase
AC148152.3_FG001 IPR023079 Sedoheptulose-1,7-bisphosphatase IPR002110   Ankyrin repeat IPR026961    PGG domain

So if ID in the first column are the same in several lines, it should produce one line for each ID with all other parts of lines joined. In the example it will give two-row file.

like image 508
boczniak767 Avatar asked Nov 06 '13 22:11

boczniak767


1 Answers

give this one-liner a try:

 awk -F'\t' -v OFS='\t' '{x=$1;$1="";a[x]=a[x]$0}END{for(x in a)print x,a[x]}' file
like image 84
Kent Avatar answered Oct 22 '22 16:10

Kent