What's a good algorithm for sorting text files that are larger than available memory (many 10s of gigabytes) and contain variable-length records? All the algorithms I've seen assume 1) data fits in memory, or 2) records are fixed-length. But imagine a big CSV file that I wanted to sort by the "BirthDate" field (the 4th field):
Id,UserId,Name,BirthDate
1,psmith,"Peter Smith","1984/01/01"
2,dmehta,"Divya Mehta","1985/11/23"
3,scohen,"Saul Cohen","1984/08/19"
...
99999999,swright,"Shaun Wright","1986/04/12"
100000000,amarkov,"Anya Markov","1984/10/31"
I know that:
Thanks! ♥
For larger data sets it proves to be inefficient so algorithms like merge sort are preferred in that case. Quick Sort is an in-place sort (i.e. it doesn't require any extra storage) so it is appropriate to use it for arrays.
External merge sort. One example of external sorting is the external merge sort algorithm, which is a K-way merge algorithm. It sorts chunks that each fit in RAM, then merges the sorted chunks together.
Quicksort. Quicksort is one of the most efficient sorting algorithms, and this makes of it one of the most used as well. The first thing to do is to select a pivot number, this number will separate the data, on its left are the numbers smaller than it and the greater numbers on the right.
Although there's no straightforward way to sort a text file, we can achieve the same net result by doing the following: 1) Use the FileSystemObject to read the file into memory; 2) Sort the file alphabetically in memory; 3) Replace the existing contents of the file with the sorted data we have in memory.
This class of algorithms is called external sorting. I would start by checking out the Wikipedia entry. It contains some discussion and pointers.
Suggest the following resources:
Merge Sort: http://en.wikipedia.org/wiki/Merge_sort
Seminumerical Algorithms, vol 2 of The Art of Computer Programming: Knuth: Addison Wesley:ISBN 0-201-03822-6(v.2)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With