
strangely slow quicksort for large tables

I have been doing my homework, which is to compare a bunch of sorting algorithms, and I came across a strange phenomenon. Things behaved as expected: insertion sort wins for small tables of around 20 ints, and otherwise quicksort outperforms heapsort and mergesort, up to a table of 500,000 ints (stored in memory). But for 5,000,000 ints (still stored in memory) quicksort suddenly becomes worse than heapsort and mergesort. The numbers are always uniformly distributed random values, and Windows virtual memory is turned off. Does anyone have an idea what could be the cause?

    void quicksortit(T *tab, int s) {
        if (s == 0 || s == 1) return;
        T tmp;
        if (s == 2) {                    // base case: sort a pair directly
            if (tab[0] > tab[1]) {
                tmp = tab[0];
                tab[0] = tab[1];
                tab[1] = tmp;
            }
            return;
        }
        T pivot = tab[s-1];              // last element as pivot
        T *f1, *f2;
        f1 = f2 = tab;
        for (int i = 0; i < s; i++)      // partition: <= pivot left of f1, > pivot between f1 and f2
            if (*f2 > pivot)
                f2++;
            else {
                tmp = *f1;
                *f1 = *f2;
                *f2 = tmp;
                f1++; f2++;
            }
        quicksortit(tab, (f1-1) - tab);  // left part, excluding the pivot at f1-1
        quicksortit(f1, f2 - f1);        // right part
    }
asked Nov 03 '14 by Domin



2 Answers

Your algorithm starts failing when there are many duplicates in the array. You only noticed this at large sizes because you have been feeding the algorithm random values with a large span (I'm assuming you used rand() over 0 to RAND_MAX), and the problem only shows up with large arrays.

When you try to sort an array of identical numbers (try sorting 100,000 identical numbers; the program will crash), you first walk through the entire array, superfluously swapping elements. Then you split the array in two, but the larger part has only been reduced by 1:

                    v
quicksortit(tab,(f1-1)-tab);

Thus your algorithm becomes O(n^2), and you also consume a very large amount of stack. Searching for a better pivot will not help you in this case; instead, choose a version of quicksort() that doesn't exhibit this flaw.

For example:

function quicksort(array)
    if length(array) > 1
        pivot := select middle, or a median of first, last and middle
        left := first index of array
        right := last index of array
        while left <= right
            while array[left] < pivot
                left := left + 1
            while array[right] > pivot
                right := right - 1
            if left <= right
                swap array[left] with array[right]
                left := left + 1
                right := right - 1
        quicksort(array from first index to right)
        quicksort(array from left to last index)

Which is a modified version of: http://rosettacode.org/wiki/Sorting_algorithms/Quicksort

answered Sep 18 '22 by 2501


It could be that your array is now bigger than the L3 cache.

Quicksort's partitioning operation moves random elements from one end of the array to the other. A typical Intel L3 cache is around 8 MB. With 5 million 4-byte elements, your array is 20 MB, and you're writing from one end of it to the other.

Cache misses that go out of L3 hit main memory, which is much slower than a miss served by a higher cache level.

That is, up until now your entire sorting operation was running completely inside the CPU caches.

answered Sep 18 '22 by Rafael Baptista