I need to sort huge binary files that won't fit into memory. Running an ordinary sort algorithm that continuously reads from and writes to the I/O device is not an option. Is it possible to use something like a memory-mapped file?
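This is a solved problem; see the Wikipedia article on external sorting: http://en.wikipedia.org/wiki/External_sorting
Basically, read in a chunk that fits in memory, sort it, save it to a temporary file, and repeat until the input is exhausted. Then merge the sorted files: read a small buffer from each one, repeatedly write out the smallest remaining element, and refill each buffer as it empties, until every file is consumed.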
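The two phases described above could be sketched roughly like this in Python. This is only an illustration under some assumptions that are not in the question: the file consists of fixed-size records, records are ordered by their raw bytes (so big-endian unsigned integers sort numerically), and the names `external_sort`, `record_size` and `records_per_chunk` are made up for the example.

```python
import heapq
import os
import tempfile


def _read_records(path, record_size):
    """Yield fixed-size records from a sorted run file, one at a time."""
    with open(path, "rb") as f:
        while True:
            rec = f.read(record_size)
            if len(rec) < record_size:
                break
            yield rec


def external_sort(src, dst, record_size, records_per_chunk):
    """Sort fixed-size binary records from src into dst via temporary run files.

    Assumption: records compare by raw byte order.
    """
    run_paths = []
    try:
        # Phase 1: read a chunk that fits in memory, sort it, save it as a run.
        with open(src, "rb") as f:
            while True:
                data = f.read(record_size * records_per_chunk)
                usable = len(data) - len(data) % record_size  # drop a partial tail
                if usable == 0:
                    break
                records = sorted(
                    data[i:i + record_size] for i in range(0, usable, record_size)
                )
                fd, path = tempfile.mkstemp(suffix=".run")
                with os.fdopen(fd, "wb") as run:
                    run.write(b"".join(records))
                run_paths.append(path)

        # Phase 2: k-way merge. heapq.merge pulls lazily from each run,
        # so only one record per run is held by the merge at a time.
        with open(dst, "wb") as out:
            runs = [_read_records(p, record_size) for p in run_paths]
            for rec in heapq.merge(*runs):
                out.write(rec)
    finally:
        # Clean up the temporary run files.
        for p in run_paths:
            os.remove(p)
```

Calling `external_sort("in.bin", "out.bin", 4, 1_000_000)` would sort 4-byte records while keeping about 4 MB of record data in memory during the first phase. With a very large number of runs you would probably want to merge in several passes so you don't exceed the open-file limit.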
UPDATE:
You may want to look at the Java code in the following post; it sounds like the author solved the problem you have.
http://www.codeodor.com/index.cfm/2007/5/10/Sorting-really-BIG-files/1194
One strategy is to sort chunks of the file with quicksort or some other fast in-memory sorting algorithm, and then merge the sorted chunks as in merge sort.
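Since the question asks about memory-mapped files, here is a rough sketch of this chunk-then-merge strategy on top of Python's `mmap` module. It rests on several assumptions of mine: fixed-size records compared by raw bytes, a non-empty input file (`mmap` refuses to map an empty one), and enough address space to map the whole file (so a 64-bit process for really big inputs). Python's built-in `sorted` (Timsort) stands in for quicksort, and `mmap_chunk_sort` is just a name invented for the example. Note that it overwrites the source file with the chunk-sorted data.

```python
import heapq
import mmap


def mmap_chunk_sort(src, dst, record_size, records_per_chunk):
    """Sort chunks of src in place through a memory map, then merge them into dst.

    Assumption: src holds fixed-size records ordered by raw bytes.
    Warning: src is modified (each chunk ends up sorted in place).
    """
    chunk_bytes = record_size * records_per_chunk
    with open(src, "r+b") as f, mmap.mmap(f.fileno(), 0) as mm:
        total = len(mm) - len(mm) % record_size  # ignore a partial trailing record
        bounds = [
            (start, min(start + chunk_bytes, total))
            for start in range(0, total, chunk_bytes)
        ]

        # Step 1: sort each chunk in memory and write it back into the mapping.
        for start, end in bounds:
            records = sorted(
                mm[i:i + record_size] for i in range(start, end, record_size)
            )
            mm[start:end] = b"".join(records)

        # Step 2: merge the sorted chunks; the OS pages data in as it is touched.
        def run(start, end):
            for i in range(start, end, record_size):
                yield mm[i:i + record_size]

        with open(dst, "wb") as out:
            for rec in heapq.merge(*(run(s, e) for s, e in bounds)):
                out.write(rec)
```

The mapping saves you the explicit read/write calls and temporary files, but the operating system still has to page the data in and out, so it is not necessarily faster than the temp-file approach.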
Here is a nice solution in C++11:
https://github.com/alveko/external_sort
And some other options: