I'm trying to make a shell script that will combine two csv files in the following way:
I have two csv files, f1.csv and f2.csv. The format of f1.csv is:
startId, endId, roomNum
f2.csv has a format like this:
startId, endId, teacherId
I want to combine these two into one csv file with this format:
startId, endId, roomNum, teacherId.
What is the best way to accomplish this with a shell script that runs under Linux?
Try:
join -t, -1 1 -2 1 -o 1.2,1.3,1.4,2.4 <(awk -F, '{print $1":"$2","$0}' f1.csv | sort) <(awk -F, '{print $1":"$2","$0}' f2.csv | sort)
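For illustration, with hypothetical header-less input files like

f1.csv:
100,200,R12
101,201,R15

f2.csv:
100,200,T7
101,201,T9

the command should print:

100,200,R12,T7
101,201,R15,T9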
How it works:
1) I first create a composite key column by joining startId and endId into startId:endId for both files:
awk -F, '{print $1":"$2","$0}' f1.csv
awk -F, '{print $1":"$2","$0}' f2.csv
2) I sort both outputs:
awk -F, '{print $1":"$2","$0}' f1.csv | sort
awk -F, '{print $1":"$2","$0}' f2.csv | sort
3) I then use the join command to join on my composite key (in the first column) and output just the columns I need.
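If you want this wrapped up as a reusable script, here is a minimal sketch. It assumes bash (for the <(...) process substitution), that neither file has a header row, and that startId and endId never contain commas or colons; the default filenames and the combined.csv output name are just placeholders.

#!/usr/bin/env bash
# Sketch: merge f1.csv (startId,endId,roomNum) and f2.csv (startId,endId,teacherId)
# on the (startId, endId) pair. Assumes no header rows.
set -euo pipefail

f1=${1:-f1.csv}
f2=${2:-f2.csv}
out=${3:-combined.csv}

# Prepend a startId:endId key to every row, sort each stream on that key,
# then join on the key and keep startId, endId, roomNum, teacherId.
join -t, -o 1.2,1.3,1.4,2.4 \
  <(awk -F, '{print $1":"$2","$0}' "$f1" | sort -t, -k1,1) \
  <(awk -F, '{print $1":"$2","$0}' "$f2" | sort -t, -k1,1) \
  > "$out"

Sorting with -t, -k1,1 restricts the sort to the composite key, which is what join compares; the plain sort in the command above also works because the key sits at the start of each line.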