Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Determining if an array has a k-majority element

Tags:

algorithm

Suppose that, given an n-element multiset A (not sorted), we want an O(n) time algorithm for determining whether A contains a majority element, i.e., an element that occurs more than n/2 times in A. It is easy to solve this in O(n) time by using the linear-time selection algorithm by finding the median (call it x), then counting how many times x occurs in A and returning it as the majority if the count exceeds n/2 (otherwise the answer is “there is no majority”). Now consider the following generalization of the problem: Given A and an integer k < n, we want an algorithm that determines whether A contains a value that occurs more than n/k times in it (if many such values exist, then it is enough to find one of them). Design an algorithm for doing this, and analyze its complexity as a function of n and k. Your grade on this question will depend on how fast your algorithm is (of course it also has to be correct). Partial credit of 10 points is given for an O(kn) time algorithm, full credit is for an O(n log k) time algorithm.

now I have come up with 2 solutions for the problem but neither fully satisfy the O(n log k) requirement. immediately i saw that i could sort the array using a O(n log n) algorithm then go through and see if any elements repeat more than n/k times linearly but that is O(n log n) not O(n log k)

I also have found and somewhat understood a O(nk) methood done by making an array of the same data type as the input and an int that is k long. then putting each element into an empty element incrementing its counter or if it matches one element in there incrementing its counter till we reach the k+1th unique element at which point you decrement all the counters by 1 till one reaches 0 at which point it is considered empty and the new element can be placed in it. and so on till the end of the input array. then checking all the elements left after we are done to see if they occur more than n/k times. but since this involves checking the n original elements against all k of the new arrays elements it is O(nk). any hints on how to do this problem in O(n log k)? I think the O(nk) algorithm is along the lines of how he wants us to think but I'm not sure where to go from here.

like image 606
user1623709 Avatar asked Aug 24 '12 21:08

user1623709


Video Answer


1 Answers

The method that you described just needs to be used recursively.

Remembering that select moves the elements that are less or equal to the median to the left of the median.

If A is of size n.

Find the median of A. Now find the median of each of the two sub multi-sets of length n/2 that were partitioned by the median. Find the median of each of the four sub multi-sets of length n/4 that were partitioned by the medians. Continue recursively until the leaves are of length n/k. Now the height of the recursive tree is O(lgk). On each level of the recursive tree, there are O(n) operations. If there exist a value that is repeated at least n/k times then it will be in one of these k with length of n/k sub multi-sets. The last operations is also done in O(n). So you get the requested running time of O(nlgk).

like image 117
Avi Cohen Avatar answered Sep 21 '22 07:09

Avi Cohen