
Number of Comparisons in Merge-Sort

I was studying merge sort when I ran into the claim that the number of comparisons in merge sort (in the worst case, according to Wikipedia) equals n ⌈lg n⌉ − 2^⌈lg n⌉ + 1; in fact it's between (n lg n − n + 1) and (n lg n + n + O(lg n)). The problem is that I cannot figure out what these formulas are trying to say. I know O(n log n) is the complexity of merge sort, but the number of comparisons?

Shahin asked Sep 10 '12


1 Answer

Why count comparisons

There are basically two operations to any sorting algorithm: comparing data and moving data. In many cases, comparing will be more expensive than moving. Think about long strings in a reference-based typing system: moving data will simply exchange pointers, but comparing might require iterating over a large common part of the strings before the first difference is found. So in this sense, comparison might well be the operation to focus on.

Why an exact count

The numbers appear to be more detailed: instead of simply giving some Landau symbol (big-Oh notation) for the complexity, you get an actual number. Once you have decided what a basic operation is, like a comparison in this case, this approach of actually counting operations becomes feasible. This is particularly important when comparing the constants hidden by the Landau symbol, or when examining the non-asymptotic case of small inputs.

Why this exact count formula

Note that throughout this discussion, lg denotes the logarithm with base 2. When you merge-sort n elements, you have ⌈lg n⌉ levels of merges. Assume you place ⌈lg n⌉ coins on each element to be sorted, and a merge costs one coin. This will certainly be enough to pay for all the merges, as each element will be included in ⌈lg n⌉ merges, and each merge won't take more comparisons than the number of elements involved. So this is the n⌈lg n⌉ from your formula.
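
To make the coin argument concrete, here is a minimal sketch (my own illustration, not part of the original argument) of a merge sort instrumented to count comparisons; for any input, the count stays within the n⌈lg n⌉ coin budget:

```python
import math

def merge_sort(a, counter):
    # Sort a list, tallying every element comparison in counter[0].
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid], counter)
    right = merge_sort(a[mid:], counter)
    merged, i, j = [], 0, 0
    # One comparison per appended element until one side runs out,
    # so merging lengths m and n costs at most m + n - 1 comparisons.
    while i < len(left) and j < len(right):
        counter[0] += 1
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

for n in range(2, 200):
    counter = [0]
    assert merge_sort(list(range(n, 0, -1)), counter) == list(range(1, n + 1))
    assert counter[0] <= n * math.ceil(math.log2(n))  # the coin budget
```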

As a merge of two arrays of length m and n takes only m + n − 1 comparisons, you still have coins left at the end, one from each merge. Let us for the moment assume that all our array lengths are powers of two, i.e. that you always have m = n. Then the total number of merges is n − 1 (sum of powers of two). Using the fact that n is a power of two, this can also be written as 2^⌈lg n⌉ − 1, and subtracting that number of returned coins from the number of all coins yields n⌈lg n⌉ − 2^⌈lg n⌉ + 1 as required.
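
As a quick sanity check (my own numbers): for n = 8 there are 4 + 2 + 1 = 7 = 2³ − 1 merges; 8·⌈lg 8⌉ = 24 coins are placed and 7 are returned, leaving 24 − 7 = 17 = 8·3 − 2³ + 1 comparisons.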

If n is 1 less than a power of two, then there are ⌈lg n⌉ merges where one element less is involved. This includes a merge of two one-element lists which used to take one coin and which now disappears altogether. So the total cost reduces by ⌈lg n⌉, which is exactly the number of coins you'd have placed on the last element if n were a power of two. So you have to place fewer coins up front, but you get back the same number of coins. This is the reason why the formula has 2^⌈lg n⌉ instead of n: the value remains the same unless you drop to a smaller power of two. The same argument holds if the difference between n and the next power of two is greater than 1.
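
Again in numbers (my own check): for n = 8 the formula gives 8·3 − 2³ + 1 = 17 comparisons, while for n = 7 it gives 7·3 − 2³ + 1 = 14, exactly ⌈lg 8⌉ = 3 fewer, matching the coins that would have been placed on the missing eighth element.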

On the whole, this results in the formula given in Wikipedia:

n ⌈lg n⌉ − 2^⌈lg n⌉ + 1
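
If you prefer to verify rather than prove: the worst-case count satisfies the recurrence C(1) = 0, C(n) = C(⌊n/2⌋) + C(⌈n/2⌉) + n − 1, since each merge of m and n elements costs at most m + n − 1 comparisons. A quick sketch (my own check, assuming a top-down merge sort with halving splits) confirming that the closed form matches:

```python
# Check the closed form n*ceil(lg n) - 2^ceil(lg n) + 1 against the
# worst-case recurrence C(n) = C(n//2) + C(n - n//2) + n - 1, C(1) = 0.
from functools import lru_cache

@lru_cache(maxsize=None)
def worst_case(n):
    if n <= 1:
        return 0
    return worst_case(n // 2) + worst_case(n - n // 2) + n - 1

def closed_form(n):
    k = (n - 1).bit_length()  # ceil(lg n), in exact integer arithmetic
    return n * k - 2**k + 1

assert all(worst_case(n) == closed_form(n) for n in range(1, 10_000))
```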

Note: I'm pretty happy with the above proof. For those who like my formulation, feel free to distribute it, but don't forget to attribute it to me as the license requires.

Why this lower bound

To prove the lower bound formula, let's write ⌈lg n⌉ = lg n + d with 0 ≤ d < 1. Now the formula above can be written as

n(lg n + d) − 2^(lg n + d) + 1 = n lg n + nd − n·2^d + 1 = n lg n − n(2^d − d) + 1 ≥ n lg n − n + 1,

where the inequality holds because 2^d − d ≤ 1 for 0 ≤ d < 1.
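
A quick numeric sanity check of the bound (my own sketch, reusing the closed form from above):

```python
import math

def closed_form(n):
    k = (n - 1).bit_length()  # ceil(lg n)
    return n * k - 2**k + 1

# The exact count never drops below n lg n - n + 1; the small epsilon
# absorbs floating-point noise, and equality holds at powers of two.
assert all(closed_form(n) >= n * math.log2(n) - n + 1 - 1e-9
           for n in range(2, 10_000))
```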

Why this upper bound

I must confess, I'm rather confused why anyone would name n lg n + n + O(lg n) as an upper bound. Even if you wanted to avoid the ceiling function, the computation above suggests something like n lg n − 0.9n + 1 as a much tighter upper bound for the exact formula: 2^d − d attains its minimum (ln(ln 2) + 1)/ln 2 ≈ 0.914 at d = −ln(ln 2)/ln 2 ≈ 0.529.
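
For the curious, a quick numeric confirmation (my own sketch) of where 2^d − d attains that minimum on [0, 1):

```python
import math

ln2 = math.log(2)
d_star = -math.log(ln2) / ln2          # ≈ 0.529, where the derivative vanishes
min_val = (math.log(ln2) + 1) / ln2    # ≈ 0.914, the minimum of 2**d - d
assert abs(2**d_star - d_star - min_val) < 1e-12
# No d on a fine grid over [0, 1) does better.
assert all(2**(i / 10**5) - i / 10**5 >= min_val - 1e-12
           for i in range(10**5))
```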

I can only guess that the quoted formula occurs in some publication, either as a rather loose bound for this algorithm, or as the exact number of comparisons for some other algorithm which is compared against this one.


Two different counts

This issue has been resolved by the comment below; one formula was originally quoted incorrectly.

equals (n lg n - n + 1); in fact it's between (n lg n - n + 1) and (n lg n + n + O(lg n))

If the first part is true, the second is trivially true as well, but explicitly stating the upper bound seems kind of pointless. I haven't looked at the details myself, but these two statements appear strange when taken together like this. Either the first one really is true, in which case I'd omit the second one as it is only confusing, or the second one is true, in which case the first one is wrong and should be omitted.

MvG answered Sep 25 '22