I just read the Wikipedia page about Bucket sort. In this article they say that the worst case complexity is O(n²). But I thought the worst case complexity was O(n + k) where k are the number of buckets. This is how I calculate this complexity:
Am I missing something?
However, in the worst-case scenario, all elements will end up in a single bucket despite our best efforts. And when that single bucket is sorted using insertion sort, it takes O(n2) time to perform it on all n elements. Therefore, the worst-case runtime of Bucket Sort is O(n2).
If insertion sort is used to sort bucket elements, the overall complexity will be linear, i.e. O(n+k). O(n) is the complexity of creating buckets, and O(k) is the complexity of sorting bucket elements using algorithms with linear time complexity in the best case.
Bucket Sort Complexity It makes the complexity depend on the sorting algorithm used to sort the elements of the bucket. The complexity becomes even worse when the elements are in reverse order. If insertion sort is used to sort elements of the bucket, then the time complexity becomes O(n2) .
In order to merge the buckets, they first need to be sorted. Consider the pseudocode given in the Wikipedia article:
function bucketSort(array, n) is
buckets ← new array of n empty lists
for i = 0 to (length(array)-1) do
insert array[i] into buckets[msbits(array[i], k)]
for i = 0 to n - 1 do
nextSort(buckets[i])
return the concatenation of buckets[0], ..., buckets[n-1]
The nextSort(buckets[i])
sorts each of the individual buckets. Generally, a different sort is used to sort the buckets (i.e. insertion sort), as once you get down and size, different, non-recursive sorts often give you better performance.
Now, consider the case where all n
elements end up in the same bucket. If we use insertion sort to sort individual buckets, this could lead to the worst case performance of O(n^2)
. I think the answer must be dependent on the sort you choose to sort the individual buckets.
What if the algorithm decides that every element belongs in the same bucket? In that case, the linked list in that bucket needs to be traversed every time an element is added. That takes 1 step, then 2, then 3, 4, 5... n . Thus the time is the sum of all of the numbers from 1 to n which is (n^2 + n)/2, which is O(n^2).
Of course, this is "worst case" (all the elements in one bucket) - the algorithm to calculate which bucket to place an element is generally designed to avoid this behavior.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With