
How does random shuffling in quick sort help in increasing the efficiency of the code?

I was going through Robert Sedgewick's lecture videos on algorithms, and he explains that random shuffling ensures we don't run into the worst-case quadratic-time scenario in quicksort. But I am unable to understand how.

asked Dec 25 '14 by Paritosh Pandey

People also ask

What is the purpose of using randomized quick sort over standard quick sort?

Randomized quick sort helps avoid the worst-case time complexity of O(n²), which occurs when the input array is already sorted. However, the average-case and best-case time complexities remain unaltered.

How can quick sort improve performance?

A second easy way to improve the performance of quicksort is to use the median of a small sample of items taken from the array as the partitioning item. Doing so will give a slightly better partition but at the cost of computing the median.
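
As a rough illustration of that idea with a sample of three items (median-of-three), here is a minimal Java sketch; the class and method names are invented for this example and are not from any particular library.

```java
public class MedianOfThreeDemo {
    // Return the index of the median of a[lo], a[mid], a[hi];
    // using that element as the pivot avoids the worst case on already sorted input.
    static int medianOfThree(int[] a, int lo, int mid, int hi) {
        if (a[lo] < a[mid]) {
            if (a[mid] < a[hi]) return mid;       // a[lo] < a[mid] < a[hi]
            return (a[lo] < a[hi]) ? hi : lo;     // a[mid] is the largest
        } else {
            if (a[lo] < a[hi]) return lo;         // a[mid] <= a[lo] < a[hi]
            return (a[mid] < a[hi]) ? hi : mid;   // a[lo] is the largest
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7};          // already sorted input
        int p = medianOfThree(a, 0, a.length / 2, a.length - 1);
        System.out.println("pivot index: " + p + ", value: " + a[p]);
    }
}
```

On already sorted input this picks the middle element, which splits the array evenly instead of degenerating to the quadratic case.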

What is the efficiency of quick sort?

Quicksort partitions an array and then calls itself recursively twice to sort the two resulting subarrays. The algorithm is quite efficient for large data sets: its average and worst-case complexities are O(n log n) and O(n²), respectively.

What happens with a randomized quick sort?

Quicksort sorts an array A by partitioning it into subarrays using a pivot element, and recursively sorting the subarrays. In the randomized (Las Vegas) version, the pivot is chosen at random from the subarray.
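
As a sketch of that randomized version (not any specific textbook's code), a Lomuto-style partition with the pivot drawn uniformly at random from the current subarray could look like this in Java:

```java
import java.util.Random;

public class RandomizedQuickSort {
    private static final Random RNG = new Random();

    public static void sort(int[] a) {
        sort(a, 0, a.length - 1);
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int p = partition(a, lo, hi);   // pivot ends up at index p
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    // Lomuto-style partition, but the pivot is chosen uniformly at random
    // from a[lo..hi] instead of always taking a fixed position.
    private static int partition(int[] a, int lo, int hi) {
        int r = lo + RNG.nextInt(hi - lo + 1);
        swap(a, r, hi);                 // move the random pivot to the end
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) swap(a, i++, j);
        }
        swap(a, i, hi);                 // place the pivot between the two parts
        return i;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i]; a[i] = a[j]; a[j] = t;
    }

    public static void main(String[] args) {
        int[] a = {5, 1, 4, 2, 8, 3};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```

Because the pivot is random, no fixed input (such as an already sorted array) can force the bad splits every time.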


2 Answers

It's really an admission that although we often talk about average case complexity, we don't in practice expect every case to turn up with the same probability.

Sorting an already sorted array is the worst case for quicksort (at least with the usual first- or last-element pivot), because every pivot you pick ends up with all the remaining elements on the same side of it, so you never split into two roughly equal halves. And in practice this already-sorted case turns up more often than other cases.

Randomly shuffling the data first is a quick way of ensuring that you really do end up with all cases turning up with equal probability, and therefore that this worst case will be as rare as any other case.
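
As a small sketch of that shuffle-first approach, the following Java code runs a Fisher-Yates (Knuth) shuffle in front of a deliberately naive first-element-pivot quicksort; the class name is invented for this example (Sedgewick's library does something similar with StdRandom.shuffle before partitioning).

```java
import java.util.Random;

public class ShuffleThenQuickSort {
    private static final Random RNG = new Random();

    public static void sort(int[] a) {
        shuffle(a);                     // defend against worst-case (e.g. sorted) input
        quicksort(a, 0, a.length - 1);
    }

    // Fisher-Yates (Knuth) shuffle: every permutation is equally likely.
    private static void shuffle(int[] a) {
        for (int i = a.length - 1; i > 0; i--) {
            int j = RNG.nextInt(i + 1);
            int t = a[i]; a[i] = a[j]; a[j] = t;
        }
    }

    // Plain quicksort that always pivots on the first element;
    // on its own this would be quadratic for already sorted input.
    private static void quicksort(int[] a, int lo, int hi) {
        if (lo >= hi) return;
        int pivot = a[lo];
        int i = lo, j = hi;
        while (i < j) {
            while (i < j && a[j] >= pivot) j--;
            a[i] = a[j];
            while (i < j && a[i] <= pivot) i++;
            a[j] = a[i];
        }
        a[i] = pivot;
        quicksort(a, lo, i - 1);
        quicksort(a, i + 1, hi);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8};   // worst case for a first-element pivot
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```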

It's worth noting that there are other strategies that deal well with already sorted data, such as choosing the middle element as the pivot.

answered Sep 23 '22 by chiastic-security


The assumption is that the worst case -- everything already sorted -- is frequent enough to be worth worrying about, and a shuffle is a black-magic, least-effort, sloppy way to avoid that case without having to admit that by improving it you're moving the problem onto another input: one that just happens to get randomly shuffled into sorted order. Hopefully that bad case is a much rarer situation, and even if it does come up, the randomness means the problem can't easily be reproduced and blamed on this cheat.

The concept of improving a common case at the expense of a rare one is fine. Using randomness as an alternative to actually thinking about which cases will be more or less common is somewhat sloppy.

answered Sep 27 '22 by keshlam