Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to join 2 csv files with a shell script?

I'm trying to make a shell script that will combine two csv files in the following way:

I have two csv files, f1.csv and f2.csv. The format of f1.csv is:

startId, endId, roomNum

f2.csv has a format like this:

startId, endId, teacherId 

I want to combine these two into one csv file with this format:

startId, endId, roomNum, teacherId. 

What is the best way to accomplish this with a shell script that runs under Linux?

like image 338
tony Huang Avatar asked Oct 11 '22 19:10

tony Huang


1 Answers

Try:

join -t, -1 1 -2 1 -o 1.2 1.3 1.4 2.4 <(awk -F, '{print $1":"$2","$0}' f1.csv | sort) <(awk -F, '{print $1":"$2","$0}' f2.csv | sort)

How it works:

1) I first create a composite key column, by joining the startId and endId into startId:endId for both files.

awk -F, '{print $1":"$2","$0}' f1.csv
awk -F, '{print $1":"$2","$0}' f2.csv

2) I sort both outputs:

awk -F, '{print $1":"$2","$0}' f1.csv | sort 
awk -F, '{print $1":"$2","$0}' f2.csv | sort 

3) I then use the join command to join on my composite key (in the first column) and output just the columns I need.

like image 156
dogbane Avatar answered Oct 13 '22 10:10

dogbane