
Why should we use n-way merge? What are its advantages over 2-way merge?

I tried to read a few articles on n-way merge, but did not understand the concept. Why would you use an n-way merge over a 2-way merge? For example, why would you divide an array into 3 parts, sort them, do a 2-way merge of 2 of the parts, and then do a 2-way merge of the 3rd part with that merged result? :)

Thanks

ADJ asked Feb 05 '13 17:02




1 Answer

You'd typically end up with multiple streams to merge when you're doing an external sort. For example, let's assume you need to sort a terabyte of data, and have only (say) 64 gigabytes of RAM.

You'd typically do that by reading in 64 gigabytes, sorting it, then writing it out. Repeat for the full terabyte of data, producing one intermediate file for each "chunk" you can hold in memory at once. There are ways to improve this (replacement selection, for example, tends to produce sorted runs about twice the size of available memory), but about the best you can typically hope for is sorted intermediate files of around 128 gigabytes each.
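The chunk-and-sort step described above might be sketched roughly as follows. This is only an illustration: the function name `write_sorted_runs`, the one-integer-per-line file format, and the `chunk_size` parameter (a count of items standing in for "what fits in RAM") are all assumptions, not anything from a particular library.

```python
import os
import tempfile
from itertools import islice


def write_sorted_runs(items, chunk_size, directory):
    """Sort `items` in chunks of at most `chunk_size` and write each chunk
    to its own run file in `directory`. Returns the run file paths.

    Hypothetical helper: one integer per line is an assumed file format.
    """
    paths = []
    it = iter(items)
    while True:
        # Read only as much as is assumed to fit in memory.
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        chunk.sort()  # in-memory sort of a single chunk
        fd, path = tempfile.mkstemp(suffix=".run", dir=directory, text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{x}\n" for x in chunk)
        paths.append(path)
    return paths
```

Each returned file is sorted on its own, but the files still have to be merged to get one fully sorted output.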

That leaves you with a number of intermediate files to merge together -- and the number will almost certainly be greater than 2.
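To put rough numbers on it, here is a small sketch that counts how many merge passes are needed for a given number of runs and merge width. It assumes 1 terabyte is taken as 1024 gigabytes and that every pass reads and rewrites the whole data set; `merge_passes` is a made-up helper name.

```python
import math


def merge_passes(num_runs, fan_in):
    """Count the passes needed to reduce `num_runs` sorted runs to one
    when each merge combines up to `fan_in` runs at a time."""
    passes = 0
    while num_runs > 1:
        num_runs = math.ceil(num_runs / fan_in)
        passes += 1
    return passes


runs = 1024 // 64  # 16 runs of 64 GB each (assuming 1 TB = 1024 GB)

print(merge_passes(runs, 2))   # 4 passes with 2-way merges
print(merge_passes(runs, 4))   # 2 passes with 4-way merges
print(merge_passes(runs, 16))  # 1 pass with a single 16-way merge
```

With 2-way merges the full terabyte gets read and written four times; a single 16-way merge reads and writes it once.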

If you're going to do this on a regular basis, you probably have some pretty high-end hardware to do it with. If you've put each intermediate file on a separate disk drive (and have at least one more for your output), you can almost certainly improve speed by merging all the data together at once, instead of only two files at a time. The process will typically be I/O bound, so reading from (say) 8 disks at a time will typically be around 4 times as fast as reading from only 2 disks at a time (though this depends on your output disk having that much bandwidth, which may not be true). By avoiding the creation of more intermediate files (which would require further merging), your overall speed will probably improve by an even larger factor.
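For what it's worth, merging all the streams at once is usually done with a min-heap that holds the current smallest item from each stream. The sketch below is a minimal in-memory illustration, with `k_way_merge` as a made-up name; in a real external sort each stream would be a buffered reader over one intermediate file.

```python
import heapq


def k_way_merge(streams):
    """Yield every item from the sorted iterables in `streams`, in sorted order."""
    iterators = [iter(s) for s in streams]
    heap = []
    # Seed the heap with the first item of each non-empty stream.
    for index, it in enumerate(iterators):
        try:
            heap.append((next(it), index))
        except StopIteration:
            pass  # empty stream contributes nothing
    heapq.heapify(heap)

    while heap:
        item, index = heapq.heappop(heap)
        yield item
        # Refill from the stream the popped item came from.
        try:
            heapq.heappush(heap, (next(iterators[index]), index))
        except StopIteration:
            pass  # that stream is exhausted


merged = list(k_way_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9]]))
print(merged)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Each output item costs O(log k) heap work for k streams, and every input is read exactly once. Python's standard library already provides this as `heapq.merge`, so in practice you'd likely use that rather than writing your own.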

Jerry Coffin answered Sep 21 '22 04:09