In a shell script I'm trying to sort a CSV file. Some fields may contain the separator and are quoted to handle this correctly. Let's say I have a file with:
"2",D,Clair
1,R,Alice
"3","F","Dennis"
2,"P,F",Bob
I want to sort this on the first colum, then the third. The result should be:
1,R,Alice
2,"P,F",Bob
"2",D,Clair
"3","F","Dennis"
There may also be escaped double quotes in the fields. In general, the CSV will conform to RFC 4180.
I tried to do this with a sort -t , -k 1,1 -k 3,3
but that doesn't work, because sort
isn't aware of the special meaning of quotes in CSV. I couldn't find a way to make sort
behave this way. Perhaps I should use another command, but I can't find any.
How to sort my CSV?
I'd use the excellent xsv for the job:
$ xsv sort --no-headers --select 1,2 input.csv
1,R,Alice
2,D,Clair
2,"P,F",Bob
3,F,Dennis
csvkit can also do it:
$ csvsort --no-header-row --columns 1,2 input.csv
a,b,c
1,R,Alice
2,D,Clair
2,"P,F",Bob
3,F,Dennis
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With