I am developing an application using java, but for that I need a csv file in some order. I dont know linux much, but wondering if some way is there to merge the csv files in the required format.
I have two csv files containing hundreds of thousands of records. a sample is below:
name,Direction,Date
abc,sent,Jan 21 2014 02:06
xyz,sent,Nov 21 2014 01:09
pqr,sent,Oct 21 2014 03:06
and
name,Direction,Date
abc,received,Jan 22 2014 02:06
xyz,received,Nov 22 2014 02:06
so, this second csv file would contain some records of file 1. What I need is a new csv like this:
name,Direction,Date,currentDirection,receivedDate
abc,sent,Jan 21 2014 02:06,received,Jan 22 2014 02:06
xyz,sent,Nov 21 2014 01:09,received,Nov 22 2014 02:06
pqr,sent,Oct 21 2014 03:06
Need to add the columns(4th and 5th column) according to the matching data in column1. if no matching data is there in the second file, the columns should be empty like above.
so is there a bash command in linux to achieve this?
awk may work for you:
kent$ awk -F, -v OFS=","
'BEGIN{print "name,Direction,Date,currentDirection,receivedDate"}
NR==FNR&&NR>1{a[$1]=$0;next}
FNR>1{printf "%s%s\n",$0,($1 in a?FS a[$1]:"")}' 2.csv 1.csv
name,Direction,Date,currentDirection,receivedDate
abc,sent,Jan 21 2014 02:06,abc,received,Jan 22 2014 02:06
xyz,sent,Nov 21 2014 01:09,xyz,received,Nov 22 2014 02:06
pqr,sent,Oct 21 2014 03:06
kent$ awk -F, -v OFS="," 'BEGIN{print "name,Direction,Date,currentDirection,receivedDate"}
NR==FNR&&NR>1{a[$1]=$2 FS $3;next}
FNR>1{printf "%s%s\n",$0,($1 in a?FS a[$1]:"")}' 2.csv 1.csv
name,Direction,Date,currentDirection,receivedDate
abc,sent,Jan 21 2014 02:06,received,Jan 22 2014 02:06
xyz,sent,Nov 21 2014 01:09,received,Nov 22 2014 02:06
pqr,sent,Oct 21 2014 03:06
You can use this join
command to accomplish this. The first file is 1.csv
and the second file is 2.csv
join -1 1 -2 1 -t, -a 1 1.csv 2.csv | sed "s/Direction,Date/currentDirection,receivedDate/2"
Output:
name,Direction,Date,currentDirection,receivedDate
abc,sent,Jan 21 2014 02:06,received,Jan 22 2014 02:06
xyz,sent,Nov 21 2014 01:09,received,Nov 22 2014 02:06
pqr,sent,Oct 21 2014 03:06
Explanation:
You want to join on the first field in both files, therefore -1 1 -2 1
You want to use comma, therefore -t,
You want to display all unmatched records in file 1, therefore -a 1
, you could also append -a 2
if needed.
The /2 in the sed command tells sed
to replace the 2nd occurrence
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With