quick sort algorithm improvement if more duplicate keys

Q: Does Quick sort work with duplicates?

When the list of items to be sorted contains a lot of duplicate values, we can improve QuickSort by grouping all the values that are equal to the pivot to the middle and then we recursively QuickSort those values on the left and those values on the right.

Q: How can you improve the performance of the quicksort algorithm?

Instead of picking the first element as pivot every time, if we pick a random element from the unexplored array and swap it with the first element and then perform the partitioning procedure (any one of the above two), then it will improve the expected or average time complexity to O(N*log N).

Q: When implementing quick sort if the array contains lots of duplicates?

When implementing quicksort, if the array contains lots of duplicates, it may be better to perform a three-way partition (into elements less than, equal to, and greater than the pivot), to make smaller recursive calls. Assume three-way comparisons, as provided by the compareTo method.

Tags:

c++

algorithm

quicksort

I am reading quick sort algorithm in book by Robert Sedwick Algorithms and data structures part 1-4.

template <class item>

static void quicksort(item [] a, int l, int r)
{
    if(r <= l) return;
    int i = partition(a,l,r);
    quicksort(a, l, i-1);
    quicksort(a, i+1, r);
}

template <class item>
int partition(item a[], int l, int r)
{
    int i = l-1; j = r; item v = a[r];

    for(;;) {
        while( a[++i] < v );
        while( v < a[--j] ) 
            if( j == l ) 
                break;
        if( i >= j) 
            break;  // pointer crossing.
        exch(a[i], a[j]);
    }

    exch(a[i], a[r]);
    return i;
}

Book has following text on above algorithm.

When duplicate keys are present in the file, the pointer crossing is subtle. we could improve the partitioning process slightly by terminating the scans when i < j, and then using j, rather than i-1, to delimit the right end of the left subfile for the first recursive call. Letting the loop iterate one more time in this case is an improvement, because, when ever the scanning loops terminate with j and i referring to the same element, we end up with two elements in their final positions: the element that stopped both scans, which must therefore be equal to the partitioning element, and the partitioning element itself. This change is probably worth making, because, in this particular case, the program leaves a record with a key equal to the partitioning key in a[r], and that makes the first partition in the call quick-sort(a, i+1, r) degenerate, because its right most key is its smallest.

My questions are

How can we modify above program with description below? I am having difficulty in modifying it to understand text.
Why above quick sort algorithm does not work effectively if more duplicate keys are present.
How above modification improve if more duplication keys are present?
What does author mean by "he first partition in the call quick-sort(a, i+1, r) degenerate, because its right most key is its smallest." ? What does author mean by degenerate here?

Thanks for your time and help.

798

asked Nov 12 '12 06:11

venkysmarty

1 Answers

>>Why above quick sort algorithm does not work effectively if more duplicate keys are present?

It becomes inefficient because your breaking condition is: if(i >= j) break;
so, as you scan from both sides with i and j, It is quite possible that you break when i == j instead of letting i surpass over j.

What bad could possibly happen when we break on i==j when many duplicate keys are present ?

When you break for i==j; from first while loop you must have had a[i] >= v and from second while loop a[j] <=v but since we are considering a 'break' for: i==j so, a[i] = a[j] = v i.e. a[i] is same as v, your pivot element.

In such a scenario, your outermost exch(a[i], a[r]); will simply exchange pivot value to itself.
Hence, in your next recursive call quicksort(a, i+1, r); for Right-half of the array, you would have minimum element sitting at the rightmost end.( your pivot choosing strategy is simply, item v = a[r]; ) and we all know it is bad for QuickSort to choose a pivot element which amounts to the minimum or the maximum of the array. Hence, your subsequent recursive call for right-half will be a degenerate one.
That is why author is advising not to break for i==j but catch them just before that happens.

>>What does author mean by degenerate here?

Degenerate here means, the recursion tree is getting skewed i.e. the subsequent problems are not being generated of nearly equal sizes. You are dividing a problem of size N into something like problems of size N-1 and 1 instead of something more balanced, like dividing it into problems of size N/2 and N/2.

>>How can we modify above program with description below?

We could implement it like following:

int partition(int A[], int l, int r){
        int i=l-1, j=r, v = A[r];
        for(;;){
                while(A[++i] < v);
                while(A[--j] > v)
                        if(j == l)
                                break;
                if(i>=j)
                        break;
                swap(A[i], A[j]);
        }
        if(i == j){// case when we stopped at the pivot element.
                j = j+1;//backtrack j 1 step.
                if(j <= r)
                    swap(A[j], A[r]);
                return j;// partition the subsequent problems around j now.
        }
        swap(A[i], A[r]);
        return i;
}

>>How above modification improve if more duplication keys are present?
It improves the performance by letting you NOT generate an obvious scenario of a degenerate case.

142

answered Sep 30 '22 19:09

srbhkmr

Related questions
                            
                                what is BOOST_ROOT?
                            
                                How can I quickly search all included header files in a project for a specific symbol?
                            
                                What is the proper process for ICMP echo request/reply on unreachable destinations?
                            
                                Member-function pointers and inheritance
                            
                                Best approach for writing a Linux Server in C (phtreads, select or fork ? )
                            
                                How can I get an intrinsic for the exp() function in x64 code?
                            
                                What is the exception specification for a defaulted virtual destructor in C++11?
                            
                                Can't set desired OpenGL version in QGLWidget
                            
                                Writing a console within an application
                            
                                Can I allow multiple programs to read from the same file at the same time?
                            
                                Tutorial or example code for extending c++11 random with generators and distributions
                            
                                Can't decrypt these strings
                            
                                Valgrind is hanging with no output
                            
                                Can I use a C++ class in python class as superclass [duplicate]
                            
                                Convert from an infix expression to postfix (C++) using Stacks
                            
                                Does an iterator iterate through boost::unordered_set or boost::unordered_map in the same order as long as the set's unchanged?
                            
                                close on socket not releasing file descriptor
                            
                                Can decltype declare an r-value?
                            
                                Where did `throw(...)` syntax come from?
                            
                                Is move assignment via destruct+move construct safe?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With