This isn't working like I expect, despite all research. I must be missing something...
File 1...
# cat file1.csv
1 123 JohnDoe
1 456 BobDylan
1 789 BillyJean
File 2...
# cat file2.csv
111 123 DaddyDoe
222 456 DaddyDylan
666 777 Stranger
555 789 DaddyJean
444 888 Stranger
333 999 Stranger
I am trying to join on the second field of both files. When I perform a left outer join and only include fields from the first file, everything seems dandy.
# join -1 2 -2 2 -a 1 -o 1.2 1.3 file1.csv file2.csv
123 JohnDoe
456 BobDylan
789 BillyJean
But as soon as I include a field from the second file, it all goes wack.
# join -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
DaddyDoeoe
DaddyDylann
789 BillyJean DaddyJean
The last line looks perfect! What's up with the others? Any idea? Thanks in advance!
EDIT: Here is my attempt with actual CSVs.
# cat file1.csv
1,123,JohnDoe
1,456,BobDylan
1,789,BillyJean
# cat file2.csv
111,123,DaddyDoe
222,456,DaddyDylan
666,777,Stranger
555,789,DaddyJean
444,888,Stranger
333,999,Stranger
# join -t, -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
,DaddyDoeoe
,DaddyDylann
789,BillyJean,DaddyJean
You used the -a option.
-a file_number
In addition to the default output, produce a line for each unpairable line in file file_number.
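To see what -a changes, here is a small sketch. The file names left.txt and right.txt and the 500 NoMatch row are made up for illustration; they are not from the question.

```shell
# Hypothetical sample data: the second line of left.txt has no partner in right.txt.
dir=$(mktemp -d)
printf '1 123 JohnDoe\n1 500 NoMatch\n' > "$dir/left.txt"
printf '111 123 DaddyDoe\n' > "$dir/right.txt"

# Without -a, only lines that pair up on field 2 are printed.
inner=$(join -1 2 -2 2 -o 1.2 1.3 2.3 "$dir/left.txt" "$dir/right.txt")
printf '%s\n' "$inner"

# With -a 1, the unpairable line from the first file is printed as well;
# its 2.3 field has no value to show.
outer=$(join -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 "$dir/left.txt" "$dir/right.txt")
printf '%s\n' "$outer"
```

So -a 1 is what makes this a left outer join, and it does not by itself explain the overwritten output.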
In addition, the odd overwriting behavior indicates that your files contain embedded carriage returns (\r). I would examine those files closely with cat -v or a text editor that doesn't try to be "smart" about Windows files.
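A minimal sketch of that check, assuming the files were saved with Windows (CRLF) line endings. The *_crlf and *_clean file names are invented here, and tr -d '\r' is just one way to strip the carriage returns (dos2unix would be another, if it is installed).

```shell
# Recreate the question's data with CRLF line endings to reproduce the symptom.
dir=$(mktemp -d)
printf '1 123 JohnDoe\r\n1 456 BobDylan\r\n1 789 BillyJean\r\n' > "$dir/file1_crlf.csv"
printf '111 123 DaddyDoe\r\n222 456 DaddyDylan\r\n666 777 Stranger\r\n555 789 DaddyJean\r\n444 888 Stranger\r\n333 999 Stranger\r\n' > "$dir/file2_crlf.csv"

# cat -v shows each carriage return as ^M at the end of the line.
cat -v "$dir/file1_crlf.csv"

# Strip the carriage returns, then run the same join on the cleaned copies.
tr -d '\r' < "$dir/file1_crlf.csv" > "$dir/file1_clean.csv"
tr -d '\r' < "$dir/file2_crlf.csv" > "$dir/file2_clean.csv"
result=$(join -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 "$dir/file1_clean.csv" "$dir/file2_clean.csv")
printf '%s\n' "$result"
```

If cat -v prints ^M on your real files, the \r at the end of field 1.3 is what sends the cursor back to the start of the line, so 2.3 gets printed over the earlier fields.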
Use the correct 'field' separator in your command.
When I changed your data to true csv, and used
join -t, -1 2 -2 2 -a 1 -o 1.2 1.3 2.3 file1.csv file2.csv
# ---^^^
I got
123,JohnDoe,DaddyDoe
456,BobDylan,DaddyDylan
789,BillyJean,DaddyJean
I hope this helps.