I've two files of 1 GB each containing only numbers in sorted order. Now I know how to read the contents of the files and sort them using merge sort algorithm and output it into an another file but what I'm interested is to how to do this only using 100MB buffer size (I do not worry about the scratch space). For example one way is to read 50 MB chunks from both the files and sort it and as it is sorted I could read a new element and continue the process till I reach the end of both files (Can anyone give me any idea how to implement this).
Sounds like you only need to merge the numbers in your files, not sort them, since they're already sorted in each file. The merge
part of merge sort is this:
function merge(left,right)
var list result
while length(left) > 0 or length(right) > 0
if length(left) > 0 and length(right) > 0
if first(left) ≤ first(right)
append first(left) to result
left = rest(left)
else
append first(right) to result
right = rest(right)
else if length(left) > 0
append left to result
break
else if length(right) > 0
append right to result
break
end while
return result
Now you can just read the first 50 MB of numbers from both files in two buffers, apply the merge algorithm, then when one of the buffers has been exhausted (all its numbers analysed), read another 50 MB from the needed file. There's no need to sort anything.
You just need a condition that checks when one of your buffers is empty. When it is, read more from the file that buffer is associated with.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With