
How can I sort a very large CSV file?

Tags:

csv

I have a large CSV file (294,000 rows) with URLs in column 1 and numbers in column 2.

I need to sort the rows from the smallest number to the largest. I have loaded the file into the software 'CSVed', which handles it without crashing. However, when I click the top of the column to sort it, the rows don't end up in order from smallest to largest; they are all just muddled up.

Does anyone have any ideas? I've been searching around all day, so I thought I'd ask here.

Thanks.

Ray Lovelock asked Jan 02 '17 08:01


2 Answers

If you have access to a Unix system (and your URLs don't contain commas), this should do the trick:

sort -t',' -n -k2 filename

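Here -t',' says the columns are delimited by commas, -n says the data is numeric, and -k2 says to sort on the key that starts at the second column. A plain text sort compares the numbers character by character, which is why the result can look muddled; -n compares them by value.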
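To make the effect of -n concrete, here is a small sketch on made-up data. The file contents, the example.com URLs and the variable names are assumptions for illustration only, and -k2,2 is used to limit the key to the second column alone.

```shell
# Hypothetical three-row sample: a URL in column 1, a number in column 2.
tmpfile=$(mktemp)
printf '%s\n' \
  'http://example.com/a,100' \
  'http://example.com/b,9' \
  'http://example.com/c,25' > "$tmpfile"

# Without -n, column 2 is compared as text, so "100" sorts before "25" and "9".
# LC_ALL=C pins the collation so the text comparison is byte-wise.
text_sorted=$(LC_ALL=C sort -t',' -k2,2 "$tmpfile")

# With -n, column 2 is compared by numeric value.
numeric_sorted=$(LC_ALL=C sort -t',' -n -k2,2 "$tmpfile")

printf 'text order:\n%s\n\nnumeric order:\n%s\n' "$text_sorted" "$numeric_sorted"
rm -f "$tmpfile"
```

The text order is the kind of muddled result described in the question; the numeric order runs from the smallest value to the largest.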

user12341234 answered Oct 03 '22 20:10


You can use GNU sort. It has a small memory footprint and can even use multiple CPUs for sorting.

sort -t , -k 2n file.csv
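If you want to control memory use and CPU count explicitly, the following sketch may help. It assumes GNU coreutils sort, where --parallel, -S (buffer size), -T (temporary directory) and -o (output file) are available; other sort implementations may not accept all of them. The generated data, the file names and the values 2 and 64M are placeholders, not recommendations.

```shell
# Generate a hypothetical 1000-row file: URL in column 1, a shuffled number in column 2.
workdir=$(mktemp -d)
awk 'BEGIN { for (i = 1; i <= 1000; i++) printf "http://example.com/%d,%d\n", i, (i * 7919) % 1000 }' \
  > "$workdir/file.csv"

# Sort numerically on column 2 only (GNU sort options assumed):
#   --parallel=2  use up to two sort threads
#   -S 64M        cap the in-memory buffer; larger inputs spill to temporary files
#   -T DIR        directory for those temporary files
#   -o FILE       write the result to FILE
sort -t , -k 2,2n --parallel=2 -S 64M -T "$workdir" -o "$workdir/sorted.csv" "$workdir/file.csv"

# Show the rows with the smallest numbers.
head -n 3 "$workdir/sorted.csv"
```

Because sort spills to temporary files once the buffer is full, the input does not have to fit in memory, so a 294,000-row file should be well within reach.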

GNU sort is available by default in most Linux distributions. macOS also ships a sort command by default, though it has slightly different options. You can install GNU sort on Windows as well, for example from the CoreUtils for Windows page.

For more information about invoking sort, see the manual.

Robert Navado answered Oct 03 '22 20:10