I have a hard time translating QuickSort with Hoare partitioning into C code, and can't find out why. The code I'm using is shown below: <pre class="prettyprint"><code>void QuickSort(int a[],int start,int end) { int q=HoarePartition(a,start,end); if (end<=start) return; QuickSort(a,q+1,end); QuickSort(a,start,q); } int HoarePartition (int a[],int p, int r) { int x=a[p],i=p-1,j=r; while (1) { do j--; while (a[j] > x); do i++; while (a[i] < x); if (i < j) swap(&a[i],&a[j]); else return j; } } </code></pre> Also, I don't really get why <code>HoarePartition</code> works. Can someone explain why it works, or at least link me to an article that does? I have seen a step-by-step work-through of the partitioning algorithm, but I don't have an intuitive feel for it. In my code, it doesn't even seem to work. For example, given the array <pre class="prettyprint"><code>13 19 9 5 12 8 7 4 11 2 6 21 </code></pre> It will use pivot 13, but end up with the array <pre class="prettyprint"><code> 6 2 9 5 12 8 7 4 11 19 13 21 </code></pre> And will return <code>j</code> which is <code>a[j] = 11</code>. I thought it was supposed to be true that the array starting at that point and going forward should have values that are all larger than the pivot, but that isn't true here because 11 < 13. Here's pseudocode for Hoare partitioning (from CLRS, second edition), in case this is useful: <pre class="prettyprint"><code>Hoare-Partition (A, p, r) x ← A[p] i ← p − 1 j ← r + 1 while TRUE repeat j ← j − 1 until A[j] ≤ x repeat i ← i + 1 until A[i] ≥ x if i < j exchange A[i] ↔ A[j] else return j </code></pre> Thanks! EDIT: The right C code for this problem will end up being: <pre class="prettyprint"><code>void QuickSort(int a[],int start,int end) { int q; if (end-start<2) return; q=HoarePartition(a,start,end); QuickSort(a,start,q); QuickSort(a,q,end); } int HoarePartition (int a[],int p, int r) { int x=a[p],i=p-1,j=r; while (1) { do j--; while (a[j] > x); do i++; while (a[i] < x); if (i < j) swap(&a[i],&a[j]); else return j+1; } } </code></pre>

To answer the question of "Why does Hoare partitioning work?": Let's simplify the values in the array to just three kinds: L values (those less than the pivot value), E values (those equal to the pivot value), and G value (those larger than the pivot value). We'll also give a special name to one location in the array; we'll call this location s, and it's the location where the j pointer is when the procedure finishes. Do we know ahead of time which location s is? No, but we know that some location will meet that description. With these terms, we can express the goal of the partitioning procedure in slightly different terms: it is to split a single array into two smaller sub-arrays which are not mis-sorted with respect to each other. That "not mis-sorted" requirement is satisfied if the following conditions are true: <ol> <li>The "low" sub-array, that goes from the left end of the array up to and includes s, contains no G values.</li> <li>The "high" sub-array, that starts immediately after s and continues to the right end, contains no L values.</li> </ol> That's really all we need to do. We don't even need to worry where the E values wind up on any given pass. As long as each pass gets the sub-arrays right with respect to each other, later passes will take care of any disorder that exists inside any sub-array. So now let's address the question from the other side: how does the partitioning procedure ensure that there are no G values in s or to the left of it, and no L values to the right of s? Well, "the set of values to the right of s" is the same as "the set of cells the j pointer moves over before it reaches s". And "the set of values to the left of and including s" is the same as "the set of values that the i pointer moves over before j reaches s". That means that any values which are misplaced will, on some iteration of the loop, be under one of our two pointers. (For convenience, let's say it's the j pointer pointing at a L value, though it works exactly the same for the i pointer pointing at a G value.) Where will the i pointer be, when the j pointer is on a misplaced value? We know it will be: <ol> <li>at a location in the "low" subarray, where the L value can go with no problems;</li> <li>pointing at a value that's either an E or a G value, which can easily replace the L value under the j pointer. (If it wasn't on an E or a G value, it wouldn't have stopped there.)</li> </ol> Note that sometimes the i and j pointer will actually both stop on E values. When this happens, the values will be switched, even though there's no need for it. This doesn't do any harm, though; we said before that the placement of the E values can't cause mis-sorting between the sub-arrays. So, to sum up, Hoare partitioning works because: <ol> <li>It separates an array into smaller sub-arrays which are not mis-sorted relative to each other;</li> <li>If you keep doing that and recursively sorting the sub-arrays, eventually there will be nothing left of the array that's unsorted.</li> </ol>

I believe that there are two problems with this code. For starters, in your Quicksort function, I think you want to reorder the lines <pre class="prettyprint"><code> int q=HoarePartition(a,start,end); if (end<=start) return; </code></pre> so that you have them like this: <pre class="prettyprint"><code> if (end<=start) return; int q=HoarePartition(a,start,end); </code></pre> However, you should do even more than this; in particular this should read <pre class="prettyprint"><code> if (end - start < 2) return; int q=HoarePartition(a,start,end); </code></pre> The reason for this is that the Hoare partition fails to work correctly if the range you're trying to partition has size zero or one. In my edition of CLRS this isn't mentioned anywhere; I had to go to the book's errata page to find this. This is almost certainly the cause of the problem you encountered with the "access out of range" error, since with that invariant broken you might run right off the array! As for an analysis of Hoare partitioning, I would suggest starting off by just tracing through it by hand. There's also a more detailed analysis here. Intuitively, it works by growing two ranges from the ends of the range toward one another - one on the left-hand side containing elements smaller than the pivot and one on the right-hand side containing elements larger than the pivot. This can be slightly modified to produce the Bentley-McIlroy partitioning algorithm (referenced in the link) that scales nicely to handle equal keys. Hope this helps!

QuickSort and Hoare Partition

Q: Which type of partitioning is used in Quicksort?

Hoare Partitioning. Hoare partitioning was proposed by Tony Hoare when the Quicksort algorithm was originally published. Instead of working across the array from low to high, it iterates from both ends at once towards the center.

Q: Is Hoare partition stable?

O(n log(n)). Like others, Hoare's partitioning doesn't produce a stable sort.

Q: Why Quicksort is unstable?

QuickSort is an unstable algorithm because we do swapping of elements according to pivot's position (without considering their original positions).

Q: When should I use Quicksort?

The sorting algorithm is used for information searching and as Quicksort is the fastest algorithm so it is widely used as a better way of searching. It is used everywhere where a stable sort is not needed. Quicksort is a cache-friendly algorithm as it has a good locality of reference when used for arrays.

Tags:

c

algorithm

sorting

quicksort

data-partitioning

I have a hard time translating QuickSort with Hoare partitioning into C code, and can't find out why. The code I'm using is shown below:

void QuickSort(int a[],int start,int end) {
    int q=HoarePartition(a,start,end);
    if (end<=start) return;
    QuickSort(a,q+1,end);
    QuickSort(a,start,q);
}

int HoarePartition (int a[],int p, int r) {
    int x=a[p],i=p-1,j=r;
    while (1) {
        do  j--; while (a[j] > x);
        do  i++; while (a[i] < x);

        if  (i < j)
            swap(&a[i],&a[j]);
        else
            return j;
    }
}

Also, I don't really get why HoarePartition works. Can someone explain why it works, or at least link me to an article that does?

I have seen a step-by-step work-through of the partitioning algorithm, but I don't have an intuitive feel for it. In my code, it doesn't even seem to work. For example, given the array

13 19  9  5 12  8  7  4 11  2  6 21

It will use pivot 13, but end up with the array

 6  2  9  5 12  8  7  4 11 19 13 21

And will return j which is a[j] = 11. I thought it was supposed to be true that the array starting at that point and going forward should have values that are all larger than the pivot, but that isn't true here because 11 < 13.

Here's pseudocode for Hoare partitioning (from CLRS, second edition), in case this is useful:

Hoare-Partition (A, p, r)
    x ← A[p]
    i ← p − 1
    j ← r + 1
    while  TRUE
        repeat   j ←  j − 1
            until     A[j] ≤ x
        repeat   i ←  i + 1
            until     A[i] ≥ x
        if  i < j
            exchange  A[i] ↔ A[j]
        else  return   j

Thanks!

EDIT:

The right C code for this problem will end up being:

void QuickSort(int a[],int start,int end) {
    int q;
    if (end-start<2) return;
    q=HoarePartition(a,start,end);
    QuickSort(a,start,q);
    QuickSort(a,q,end);
}

int HoarePartition (int a[],int p, int r) {
    int x=a[p],i=p-1,j=r;
    while (1) {
        do  j--; while (a[j] > x);
        do  i++; while (a[i] < x);
        if  (i < j) 
            swap(&a[i],&a[j]);
        else 
            return j+1;
    }
}

950

asked Aug 25 '11 22:08

Ofek Ron

2 Answers

To answer the question of "Why does Hoare partitioning work?":

Let's simplify the values in the array to just three kinds: L values (those less than the pivot value), E values (those equal to the pivot value), and G value (those larger than the pivot value).

We'll also give a special name to one location in the array; we'll call this location s, and it's the location where the j pointer is when the procedure finishes. Do we know ahead of time which location s is? No, but we know that some location will meet that description.

With these terms, we can express the goal of the partitioning procedure in slightly different terms: it is to split a single array into two smaller sub-arrays which are not mis-sorted with respect to each other. That "not mis-sorted" requirement is satisfied if the following conditions are true:

The "low" sub-array, that goes from the left end of the array up to and includes s, contains no G values.
The "high" sub-array, that starts immediately after s and continues to the right end, contains no L values.

That's really all we need to do. We don't even need to worry where the E values wind up on any given pass. As long as each pass gets the sub-arrays right with respect to each other, later passes will take care of any disorder that exists inside any sub-array.

So now let's address the question from the other side: how does the partitioning procedure ensure that there are no G values in s or to the left of it, and no L values to the right of s?

Well, "the set of values to the right of s" is the same as "the set of cells the j pointer moves over before it reaches s". And "the set of values to the left of and including s" is the same as "the set of values that the i pointer moves over before j reaches s".

That means that any values which are misplaced will, on some iteration of the loop, be under one of our two pointers. (For convenience, let's say it's the j pointer pointing at a L value, though it works exactly the same for the i pointer pointing at a G value.) Where will the i pointer be, when the j pointer is on a misplaced value? We know it will be:

at a location in the "low" subarray, where the L value can go with no problems;
pointing at a value that's either an E or a G value, which can easily replace the L value under the j pointer. (If it wasn't on an E or a G value, it wouldn't have stopped there.)

Note that sometimes the i and j pointer will actually both stop on E values. When this happens, the values will be switched, even though there's no need for it. This doesn't do any harm, though; we said before that the placement of the E values can't cause mis-sorting between the sub-arrays.

So, to sum up, Hoare partitioning works because:

It separates an array into smaller sub-arrays which are not mis-sorted relative to each other;
If you keep doing that and recursively sorting the sub-arrays, eventually there will be nothing left of the array that's unsorted.

199

answered Oct 06 '22 13:10

afeldspar

I believe that there are two problems with this code. For starters, in your Quicksort function, I think you want to reorder the lines

 int q=HoarePartition(a,start,end);
 if (end<=start) return;

so that you have them like this:

 if (end<=start) return;
 int q=HoarePartition(a,start,end);

However, you should do even more than this; in particular this should read

 if (end - start < 2) return;
 int q=HoarePartition(a,start,end);

The reason for this is that the Hoare partition fails to work correctly if the range you're trying to partition has size zero or one. In my edition of CLRS this isn't mentioned anywhere; I had to go to the book's errata page to find this. This is almost certainly the cause of the problem you encountered with the "access out of range" error, since with that invariant broken you might run right off the array!

As for an analysis of Hoare partitioning, I would suggest starting off by just tracing through it by hand. There's also a more detailed analysis here. Intuitively, it works by growing two ranges from the ends of the range toward one another - one on the left-hand side containing elements smaller than the pivot and one on the right-hand side containing elements larger than the pivot. This can be slightly modified to produce the Bentley-McIlroy partitioning algorithm (referenced in the link) that scales nicely to handle equal keys.

Hope this helps!

answered Oct 06 '22 13:10

templatetypedef

Related questions
                            
                                Is it possible to build a interactive C shell?
                            
                                Would making plain int 64-bit break a lot of reasonable code?
                            
                                How do I get tcsetpgrp() to work in C?
                            
                                Getting Segmentation Fault
                            
                                How to write a hello world kernel?
                            
                                Is it good programming practice to use setjmp and longjmp in C?
                            
                                Valgrind complains with "Invalid write of size 8"
                            
                                GCC SSE code optimization
                            
                                How to convert a 128-bit integer to a decimal ascii string in C?
                            
                                Proper usage of volatile sig_atomic_t
                            
                                printf rounding behavior for doubles
                            
                                How do modern VMs handle memory allocation?
                            
                                Why use `sigsetjmp` instead of `setjmp` function in C?
                            
                                How to get environment of a program while debugging it in GDB
                            
                                fork and existing threads?
                            
                                How does one do Raw IO on Mac OS X? (ie. equivalent to Linux's O_DIRECT flag)
                            
                                How do i remove a signal handler
                            
                                scalar object requires one element in initializer
                            
                                Redirecting STDIN, STDOUT, STDERR to /dev/null in C
                            
                                Set pipe buffer size

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With