Design and analyze a linear time algorithm to determine whether there exists an element in a list of n elements which repeats itself at least n/10 times in the list.
How can I do this? I'm posting my own idea as an answer.
I assume the elements are comparable.
Perform a selection algorithm for the: n/10th, 2n/10th, ..., 9n/10th, 10(n/10)th smallest elements1
These are your candidates. Check the #occurrences for each of them, and if one of them repeats at least n/10 times the answer is true
. Otherwise, it is false
.
Proof:
If an element appears at least n/10 times, then it must "intersect" with k*n/10
for some k
(in a sorted list)2. Thus, this element will be chosen as a "candidate", and you will later discover (by counting) exactly how many times it appears, and will return true
.
If no element is repeated n/10
times, the algorithm will trivially return false
in the last step of checking each candidate.
Complexity:
Each selection algorithm is O(n)
, and it is done 10 times. Also for each candidate requires linear scan on the list, which is also O(n)
totaling in O(n)
overall—but with terrible constants.
Explanations:
(1) Selection algorithm will find the element which would be in the index n/10, 2n/10,...9n/10 in a sorted list, and the algorithm itself is only O(n)
(2) Let's look at the example of [1,2,..,100,100,..,100]
(11 times 100).
Note that the list is sorted and the element 100 appears in: list[9*n/10]
(index 9*n/10
). The idea of the selection algorithm is, that—even if you shuffle the list—select(list,9*n/10)
will always return the same element—in this case 100—since it is the 9n/10
th element in the sorted list (this is what the algorithm does).
Now, you can see that for each element (let it be e
) that repeats n/10
times, there is some index i
such that in the sorted version of the list, all elements in indices i,i+1,...,i+n/10
will be e
. One of these indices must be one of k*n/10
for some k
(convince yourself why!). Thus, the selection algorithm on k*n/10
will yield e
.
Let me tell you a clever single pass algorithm to find a majority element (one with frequency higher than n/2) and you should see how to adapt it:
best = null
count = 0
foreach x in list:
if (count == 0)
count++
best = x
else if (best == x)
count++
else
count--
return best
If there is a majority element, the above algorithm will find it (one pass, constant space). Once you figure out how it works you will see how it can be adapted to the n/10 case.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With