I am Working through the analysis of deterministic median finding under the assumption that the input is divided into 3 parts rather than 5 and the question is Where does it break down?
the deterministic median finding algorithm:
SELECT(i, n)
Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
Recursively SELECT the median x of the ⎣n/5⎦ group medians to be the pivot.
Partition around the pivot x. Let k = rank(x)
4.if i = k then return x
elseif i < k
then recursively SELECT the ith smallest element in the lower part
else recursively SELECT the (i–k)th smallest element in the upper part
I went through the analysis of the algorithm and I believe that Step 1 and 3 will take O(n) where it just takes just constant time to find the median of 5 elements and Step2 takes T(n/5).so at least 3/10 of elements are ≤ p, and at least 3/10 of the array is ≥ p, therefore,Step 4 will T(7n/10) and will get the recurrence. T(n) ≤ cn + T(n/5) + T(7n/10), but when I divide the elements in goroup of 3, let say the 9 elements and I divided them in group such that:
{1,2,10} {4,11,14}, {15,20,22}
I got the medians 2,11,20 and the p=11.
in general in group of five lets say g = n/5 groups and at least ⌈g/2⌉ of them (those groups whose median is ≤ p) at least three of the five elements are ≤ p. so the total number of elements ≤ p is at least 3⌈g/2⌉ ≥ 3n/10. BUT in group of 3 we can get all the three elements might be LESS than p. and here I think the algorithm will break down !!
Did I get the idea correct???
The median-of-medians algorithm is a deterministic linear-time selection algorithm. The algorithm works by dividing a list into sublists and then determines the approximate median in each of the sublists. Then, it takes those medians and puts them into a list and finds the median of that list.
In computer science, the median of medians is an approximate (median) selection algorithm, frequently used to supply a good pivot for an exact selection algorithm, mainly the quickselect, that selects the kth smallest element of an initially unsorted array.
So with groups of 3 elements SELECT does not run in linear time. The reason for this is that during step 5, we are still left with a subproblems of total size n.
The median of medians is not the same as the median of the raw scores. A simple case of this is that when you have an odd number of sales, the median is the middle value; when you have an even number of sales, the median is commonly taken as the average between those two values.
In a group of 3, as for the groups of 5, about half of the groups will have their median element less than the median-of-medians, so in those groups you can discard elements less than their median. In your case, (1,2,10) has its median less than 11, so you can discard 1 and 2.
Where I think things break down for groups of 3 is in the costing. 3(floor(floor(n/5)/2 - 2) which is roughly 3n/10 becomes 2(floor(floor(n/3)/2 -2) or so, which is roughly n/3. This means that 7n/10 becomes 2n/3. floor(n/5) becomes floor(n/3), so instead of 7cn/10 + 2cn/10 = 9cn/10 you are going to get just 2cn/3 + cn/3 = cn, and instead of T(n) <= cn you are going to have something where you will have to look closely at the terms that don't involve c, and you might end up showing that it is not linear after all.
It looks like you actually get to throw away slightly more elements at each stage of the recursion, but the recursion divides the amount of work left by 3, not 5, and this isn't enough to break even.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With