Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to sort a CSV with quoted fields (that may contain the separator)

In a shell script I'm trying to sort a CSV file. Some fields may contain the separator and are quoted to handle this correctly. Let's say I have a file with:

"2",D,Clair
1,R,Alice
"3","F","Dennis"
2,"P,F",Bob

I want to sort this on the first colum, then the third. The result should be:

1,R,Alice
2,"P,F",Bob
"2",D,Clair
"3","F","Dennis"

There may also be escaped double quotes in the fields. In general, the CSV will conform to RFC 4180.

I tried to do this with a sort -t , -k 1,1 -k 3,3 but that doesn't work, because sort isn't aware of the special meaning of quotes in CSV. I couldn't find a way to make sort behave this way. Perhaps I should use another command, but I can't find any.

How to sort my CSV?

like image 946
Rinke Avatar asked Dec 19 '17 22:12

Rinke


1 Answers

I'd use the excellent xsv for the job:

$ xsv sort --no-headers --select 1,2 input.csv
1,R,Alice
2,D,Clair
2,"P,F",Bob
3,F,Dennis

csvkit can also do it:

$ csvsort --no-header-row --columns 1,2 input.csv
a,b,c
1,R,Alice
2,D,Clair
2,"P,F",Bob
3,F,Dennis
like image 182
aude Avatar answered Oct 14 '22 19:10

aude