
Bash: sort csv file by first 4 columns

I have a CSV file with fields delimited by ";". There are 8 fields, and I want to sort the data by the first 4 columns in increasing order (first by column 1, then by column 2, and so on).

How can I do this from the command line in Linux?

I tried with OpenOffice, but it only lets me select 3 columns.

EDIT: Among the fields I want to sort on, three contain strings with numerical values and one contains only strings. How can I specify this with the sort command?

Ricky Robinson asked Aug 13 '12 13:08



2 Answers

sort -k will allow you to define the sort key. From man sort:

-k, --key=POS1[,POS2]
       start a key at POS1 (origin 1), end it at POS2 (default end of line). 

So

$ sort -t\; -k1,4

should do it. Note that I've escaped the semicolon; otherwise the shell would interpret it as a command separator.
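One caveat worth checking before relying on this: -k1,4 defines a single key that runs from the start of field 1 to the end of field 4, so the ";" separators inside that span are compared as ordinary characters. That is not always the same as sorting by column 1, then column 2, and so on. The sketch below illustrates the difference on a made-up two-row sample; the data and variable names are hypothetical, and LC_ALL=C is set only to make the comparison plain byte order.

```shell
#!/bin/sh
# Hypothetical two-row sample; LC_ALL=C forces plain byte-order comparison.
data='a1;x;0;0
a;y;0;0'

# One key spanning fields 1-4: the ";" separators are part of the compared text,
# and "1" sorts before ";" in byte order, so "a1;..." comes first.
single_key=$(printf '%s\n' "$data" | LC_ALL=C sort -t';' -k1,4)

# Four separate keys: field 1 is compared on its own ("a" before "a1"),
# and later fields are only used to break ties.
per_field=$(printf '%s\n' "$data" | LC_ALL=C sort -t';' -k1,1 -k2,2 -k3,3 -k4,4)

printf 'single key:\n%s\n\nper field:\n%s\n' "$single_key" "$per_field"
```

Here -t';' is just a quoted form of -t\;. If the goal is strictly "column 1, then column 2, ...", the per-field form is probably the safer choice.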

Brian Agnew answered Oct 19 '22 03:10


Try:

sort -t\; -k 1,1n -k 2,2n -k 3,3n -k 4,4n test.txt

For example, given this input (in a file named temp3):

1;2;100;4
1;2;3;4
10;1;2;3
9;1;2;3

> sort -t\; -k 1,1n -k 2,2n -k 3,3n -k 4,4n temp3
1;2;3;4
1;2;100;4
9;1;2;3
10;1;2;3
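The question's EDIT mentions that one of the four sort columns holds text rather than numbers. A possible way to handle that is to put the n modifier only on the numeric keys and leave the text key without it. This is a minimal sketch, assuming the text column is column 3 and the numeric fields are plain unquoted numbers; the sample rows and the variable names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical 5-field sample: columns 1, 2 and 4 are numeric, column 3 is text.
data='10;2;beta;5;x
9;2;alpha;7;y
9;2;alpha;10;z
9;1;gamma;3;w'

# "n" after a key makes only that key numeric; column 3 has no modifier,
# so it is compared as text (byte order here, because of LC_ALL=C).
mixed=$(printf '%s\n' "$data" | LC_ALL=C sort -t';' -k1,1n -k2,2n -k3,3 -k4,4n)

printf '%s\n' "$mixed"
```

For a real file, the same keys can be passed with the file name as the last argument, as in the command at the top of this answer. Adjust which key drops the n to match the actual text column.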

Vijay answered Oct 19 '22 04:10