Whats the best way to implement an N way merge for N sorted files?
Lets say I have 9 sorted files with 10 records each? How do I merge these files to create a big file with 90 sorted records?
I'm assuming that there could be a lot more data then you gave in your example. If you can open all the files simultaneously you can use this algorithm:
Note that you don't have to read all the files into memory at once, so this will work well if you have a reasonable number of large files, but not if you have a lot of small files.
If you have a lot of small files, you should merge them in groups to make a single output file for each group, then repeat the process to merge these new groups.
In C# you can use for example a SortedDictionary
to implement the priority queue.
Addressing the comments in the other answer:
If you have an variable number of files, here's what I'd do. This is just a sketch to get the idea across; this code doesn't compile, I've gotten the method names wrong, and so on.
// initialize the data structures
var priorityQueue = new SortedDictionary<Record, Stream>();
var streams = new List<Stream>();
var outStream = null;
try
{
// open the streams.
outStream = OpenOutputStream();
foreach(var filename in filenames)
streams.Add(GetFileStream(filename));
// initialize the priority queue
foreach(var stream in streams)
{
var record = ReadRecord(stream);
if (record != null)
priorityQueue.Add(record, stream);
// the main loop
while(!priorityQueue.IsEmpty)
{
var record = priorityQueue.Smallest;
var smallestStream = priorityQueue[record];
WriteRecord(record, outStream);
priorityQueue.Remove(record);
var newRecord = ReadRecord(smallestStream);
if (newRecord != null)
priorityQueue.Add(newRecord, smallestStream);
}
}
finally { clean up the streams }
Does that make sense? You just keep on grabbing the smallest thing out of the priority queue and replacing it with the next record in that stream, if there is one. Eventually the queue will be empty and you'll be done.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With