The function is defined as
void bucketsort(Array& A){
    size_t numBuckets = A.size();
    iarray<List> buckets(numBuckets);
    // put in buckets
    for(size_t i = 0; i != A.size(); i++){
        buckets[int(numBuckets * A[i])].push_back(A[i]);
    }
    //// get back from buckets
    //for(size_t i = 0, head = 0; i != numBuckets; i++){
    //    size_t bucket_size = buckets[i].size();
    //    for(size_t j = 0; j != bucket_size; j++){
    //        A[head + j] = buckets[i].front();
    //        buckets[i].pop_front();
    //    }
    //    head += bucket_size;
    //}
    for(size_t i = 0, head = 0; i != numBuckets; i++){
        while(!buckets[i].empty()){
            A[head] = buckets[i].back();
            buckets[i].pop_back();
            head++;
        }
    }
    // insertion sort
    insertionsort(A);
}
where List is just std::list<double> from the STL.
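The other type aliases are not shown in the question; presumably they are something like the sketch below (Array, IndexArray, and iarray are guesses, not confirmed by the post):

#include <list>
#include <vector>

typedef std::list<double> List;           // stated in the question
typedef std::vector<double> Array;        // assumption: not shown in the post
typedef std::vector<size_t> IndexArray;   // assumption: not shown in the post
// iarray<T> looks like a custom array template, also not shown;
// std::vector<List> would play the same role here.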
The contents of the array are generated randomly in [0, 1). Theoretically, bucket sort should be faster than quicksort for large sizes, since it is O(n), but it fails, as shown in the following graph.
I used google-perftools to profile it on an array of 10,000,000 doubles. The report shows most of the time going into std::_List_node_base::_M_hook. It seems I should not use the STL list, but I wonder why. What does std::_List_node_base::_M_hook do? Should I write a list class myself?
PS: the experiment and improvements. I have tried running only the code that puts elements into buckets, and this confirmed that most of the time is spent building up the buckets. The following improvements were made:
- Use an STL vector as each bucket and reserve a reasonable amount of space for it.
- Use two helper arrays to store the information needed to build the buckets, thus avoiding the linked list entirely, as in the following code:
void bucketsort2(Array& A){
    size_t numBuckets = ceil(A.size() / 1000.0); // divide as double, otherwise ceil is a no-op
    Array B(A.size());
    // the extra entry at the end of head avoids a bounds check against i == A.size()-1
    IndexArray head(numBuckets + 1, 0), offset(numBuckets, 0);
    for(size_t i = 0; i != A.size(); i++){
        head[int(numBuckets * A[i]) + 1]++; // note the +1
    }
    for(size_t i = 2; i < numBuckets; i++){ // head[1] is right already
        head[i] += head[i - 1];
    }
    for(size_t i = 0; i < A.size(); i++){
        size_t bucket_num = int(numBuckets * A[i]);
        B[head[bucket_num] + offset[bucket_num]] = A[i];
        offset[bucket_num]++;
    }
    A.swap(B);
    //insertionsort(A);
    for(size_t i = 0; i < numBuckets; i++)
        quicksort_range(A, head[i], head[i] + offset[i]);
}
The results are shown in the following graph, where lines starting with "list" use a list as the bucket, lines starting with "vector" use a vector as the bucket, and lines starting with "2" use the helper arrays. By default insertion sort is used at the end; some runs use quicksort instead, because the bucket size is big. Note that "list" and "list, only put in", as well as "vector, reserve 8" and "vector, reserve 2", nearly overlap. I will also try small sizes with enough memory reserved.
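For anyone who wants to reproduce the measurements, a minimal timing harness might look like the sketch below; it assumes Array is std::vector<double> and that bucketsort2 above is available under that assumption.

#include <chrono>
#include <cstdio>
#include <random>
#include <vector>

void bucketsort2(std::vector<double>& A); // from above, assuming Array = std::vector<double>

// Fill an array with uniform doubles in [0, 1) and time one sort call.
template <typename Sort>
double time_sort(Sort sort, std::size_t n) {
    std::mt19937_64 rng(42);                                // fixed seed for repeatable runs
    std::uniform_real_distribution<double> dist(0.0, 1.0);  // produces values in [0, 1)
    std::vector<double> A(n);
    for (double& x : A) x = dist(rng);
    auto t0 = std::chrono::steady_clock::now();
    sort(A);
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

int main() {
    for (std::size_t n : {100000ul, 1000000ul, 10000000ul})
        std::printf("n=%zu: %.3fs\n", n,
                    time_sort([](std::vector<double>& A) { bucketsort2(A); }, n));
}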
Time complexity analysis: the time complexity of bucket sort largely depends on the number of buckets and on the range over which the elements of the array/list are distributed.
Limitations of bucket sort:
- It is not suitable for sorting arbitrary strings; it is only good for sorting data uniformly distributed over a range such as [0, 1).
- It is not an in-place sort, because the buckets require additional space.
- It becomes expensive for huge arrays, since the auxiliary space grows with the input.
Bucket sort can be exceptionally fast because of the way elements are assigned to buckets, typically using an array index computed directly from the value. The trade-off is that it needs more auxiliary memory for the buckets than comparison sorts do, in exchange for the better running time.
As a result, bucket sort works best when the data are more or less uniformly distributed, or when there is an intelligent way to choose the buckets from a quick set of heuristics based on the input array. Bucket sort also works well if you have a large degree of parallelism available.
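To illustrate the parallelism point: the per-bucket sorts are independent, so they can run concurrently. Below is a minimal sketch (not from the answer above) using std::async, with std::sort standing in for the per-bucket sort; a real implementation would cap the number of threads rather than launch one task per bucket.

#include <algorithm>
#include <cstddef>
#include <future>
#include <vector>

// Sketch: distribute sequentially, then sort each bucket concurrently.
// Assumes values in [0, 1); sensible only for a modest number of buckets,
// since one task is launched per bucket.
void parallel_bucketsort(std::vector<double>& A, std::size_t numBuckets) {
    std::vector<std::vector<double>> buckets(numBuckets);
    for (double x : A)
        buckets[static_cast<std::size_t>(numBuckets * x)].push_back(x);

    std::vector<std::future<void>> jobs;
    for (auto& b : buckets)
        jobs.push_back(std::async(std::launch::async,
                                  [&b] { std::sort(b.begin(), b.end()); }));
    for (auto& j : jobs) j.get(); // wait for every bucket to finish

    std::size_t head = 0; // concatenate the sorted buckets in order
    for (const auto& b : buckets)
        for (double x : b) A[head++] = x;
}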
In my opinion, the biggest bottleneck here is the memory management functions (such as new and delete).
Quicksort (of which STL probably uses an optimized version) can sort an array in-place, meaning it requires absolutely no heap allocations. That is why it performs so well in practice.
Bucket sort relies on additional working space, which is assumed to be readily available in theory (i.e. memory allocation is assumed to take no time at all). In practice, memory allocation can take anywhere from (large) constant time to linear time in the size of memory requested (Windows, for example, will take time to zero the contents of pages when they are allocated). This means standard linked list implementations are going to suffer, and dominate the running time of your sort.
Try using a custom list implementation that pre-allocates memory for a large number of items, and you should see your sort running much faster.
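For illustration, here is a minimal sketch of that idea, assuming Array is std::vector<double>: a singly linked list whose nodes all come from one pre-allocated pool, so filling the buckets performs a single heap allocation instead of one per element. NodePool and PooledList are hypothetical names, not from the question.

#include <cstddef>
#include <vector>

// Hypothetical sketch: all bucket nodes live in one pre-allocated pool,
// so pushing an element never calls operator new individually.
struct NodePool {
    struct Node { double value; int next; }; // next: index into pool, -1 = end
    std::vector<Node> nodes;                 // one allocation up front
    explicit NodePool(std::size_t capacity) { nodes.reserve(capacity); }
    int make(double v, int next) {           // returns the index of the new node
        nodes.push_back(Node{v, next});
        return static_cast<int>(nodes.size() - 1);
    }
};

struct PooledList {                          // singly linked list of pool indices
    int head = -1;
    void push_front(NodePool& pool, double v) { head = pool.make(v, head); }
};

void bucketsort_pooled(std::vector<double>& A) {
    std::size_t numBuckets = A.size();
    NodePool pool(A.size());                 // exactly one big allocation
    std::vector<PooledList> buckets(numBuckets);
    for (double x : A)                       // assumes x in [0, 1)
        buckets[static_cast<std::size_t>(numBuckets * x)].push_front(pool, x);
    // Drain buckets back into A; order within a bucket is reversed by
    // push_front, which is fine since an insertion sort follows anyway.
    std::size_t head = 0;
    for (const auto& b : buckets)
        for (int n = b.head; n != -1; n = pool.nodes[n].next)
            A[head++] = pool.nodes[n].value;
    // insertionsort(A); // as in the question; not defined in this sketch
}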
With
iarray<List> buckets(numBuckets);
you are basically creating a LOT of lists, and that can cost you a lot, especially in memory access, which is theoretically linear but is not in practice.
Try to reduce the number of buckets. To verify my assertion, profile your code with only the creation of the lists.
Also, to iterate over the elements of the lists you should not use .size(), but rather:
// get back from buckets
for(size_t i = 0, head = 0; i != numBuckets; i++)
    while(!buckets[i].empty())
    {
        A[head++] = buckets[i].front();
        buckets[i].pop_front();
    }
In some implementations .size() can be O(n); before C++11, std::list::size() was in fact allowed to be linear. Unlikely to matter here, but...
As for std::_List_node_base::_M_hook: it seems to do nothing more than insert a node at a given place in a list. That by itself shouldn't cost a lot.
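For reference, the pointer surgery is roughly the following (simplified from libstdc++; names approximate):

struct ListNodeBase {
    ListNodeBase* next;
    ListNodeBase* prev;
    // Hook *this into a doubly linked list just before position:
    // four pointer writes, no allocation. The allocation that actually
    // hurts happens earlier, when the node itself is created.
    void hook(ListNodeBase* position) {
        next = position;
        prev = position->prev;
        position->prev->next = this;
        position->prev = this;
    }
};

So if the profiler attributes a lot of time to it, much of that is likely cache misses on freshly allocated, cold nodes rather than the pointer updates themselves.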