 

How to join multiple txt files into one based on a column?

Tags: linux, bash, join

I have several txt files, all in the same directory. Each one has 2 columns of data. They look like this:

Label1 DataA1
Label2 DataA2
Label3 DataA3

I would like to use join to make one large file, like this:

Label1 DataA1 DataB1 DataC1
Label2 DataA2 DataB2 DataC2
Label3 DataA3 DataB3 DataC3

Currently, I have

join fileA fileB | join - fileC

However, I have too many files to make it practical to list all of them - is there a way to write a loop for this sort of command?
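For reference, a loop over join itself could look roughly like this (a sketch, not a tested solution, assuming the files match a glob like file* and each is already sorted on column 1, as join requires):

out=$(mktemp)
tmp=$(mktemp)
first=1
for f in file*; do
    if [ "$first" -eq 1 ]; then
        cp "$f" "$out"            # seed the result with the first file
        first=0
    else
        join "$out" "$f" > "$tmp" # merge the next file on column 1
        mv "$tmp" "$out"
    fi
done
cat "$out"
rm -f "$out" "$tmp"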

Justin asked Aug 09 '13

1 Answer

With awk you could do it like this:

awk 'NF > 0 { a[$1] = a[$1] " " $2 } END { for (i in a) { print i a[i]; } }' file*
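The same program spelled out for readability (identical logic, just reformatted with comments; save it to a file, say join.awk, and run awk -f join.awk file*):

# For every non-empty line, append column 2 to the entry keyed by column 1
NF > 0 {
    a[$1] = a[$1] " " $2
}
# After all files are read, print each label followed by its collected values
END {
    for (i in a) {
        print i a[i]
    }
}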

If you want the files processed in sorted name order:

find . -maxdepth 1 -type f -name 'file*' -print0 | sort -z | xargs -0 awk 'NF > 0 { a[$1] = a[$1] " " $2 } END { for (i in a) { print i a[i]; } }'

Note that for (i in a) does not iterate over the keys in any guaranteed order, so you may also want to sort them; the asorti function used below is only available in gawk. Also, collecting values in an array keyed on column 1 only lines up cleanly if every file uses the same set of labels in column 1.

gawk 'NF > 0 { a[$1] = a[$1] " " $2 } END { count = asorti(a, b); for (i = 1; i <= count; ++i) { j = b[i]; print j a[j]; } }' ...
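As another gawk-only alternative, recent gawk versions let you set PROCINFO["sorted_in"] so that the for (i in a) loop itself walks the keys in sorted order, keeping the program close to the original one-liner:

gawk 'NF > 0 { a[$1] = a[$1] " " $2 } END { PROCINFO["sorted_in"] = "@ind_str_asc"; for (i in a) print i a[i] }' file*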
konsolebox answered Nov 03 '22