Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Most common element in an array / Finding the relative majority, deterministically in O(n) time and O(1) space?

So for example, the answer for the array:

1, 11, 3, 95, 23, 8, 1

would be 1, since all the other elements only occur once while 1 occurs twice.

A lot of the questions similar to this question that I've seen on stackoverflow ask to find the absolute majority (the answer occurs at least n/2 in an array of length n), or answer the question using sorting or a hash table. The former is not what I'm asking, and the latter is either too slow ( O(n log n) for sorting ) or uses too much memory ( O(n) for a hash table ).

Does such an algorithm exist? If not, is there a proof showing why it's impossible? Including a source would be nice.

like image 812
weeb Avatar asked Aug 02 '12 16:08

weeb


People also ask

What is a majority element in an array?

The majority element of an array is the element that occurs repeatedly for more than half of the elements of the input. If we have a sequence of numbers then the majority element appears at least times in the sequence.

How do you find the majority element in an array using divide and conquer?

If we know the majority element in the left and right halves of an array, we can determine which is the global majority element in linear time. Here, we apply a classical divide & conquer approach that recurses on the left and right halves of an array until an answer can be trivially achieved for a length-1 array.

How do you find the maximum occurence of a number in an array?

Navigate the array. Update the array as for ith index :- arrA[arrA[i]% n] = arrA[arrA[i]% n] + n; Now navigate the updated array and check which index has the maximum value, that index number is the element which has the maximum occurrence in the array.


1 Answers

If you want to have fixed space to find the most common element you need to have a maximum number of bits for an element. If you didn't, then large input arrays could have larger input numbers such that the bits to represent the number is bigger than your fixed space to store the result.

Suppose k is the length of the largest number you support. If you try to naively create an array of 2^k buckets to count occurrences of each number (counter sort) you could receive an array consisting of the same number, in which case your algorithm would end up needing log(n) space to store the sum.[*]

If we look at a simpler version of the problem - determine whether or not there are more 1's or 0's in an input, I think you need a stack to do this (you store how much 1 or 0 is leading by), and so constant space just isn't possible, even if we limit the input length to k = 1 bit in size.

Your problem is more general (k > 1, but still fixed), and would also need non-constant space, so it's not possible, as the question is worded.

[*] If you assume counters have O(1) space complexity, then you can take the counter sort approach, although by doing so you've placed an upper-bound on the maximum size of your input array (which may or may not be acceptable): In terms of k, the maximum number of bits for an input element of your array and in terms of c the maximum number of bits in your counter your array can have at most 2^k * 2^c elements (one of the counters would overflow otherwise on the next element). To address this, you could add a O(1) time step to decrement your counters so that the minimum value is always 0 after each element is processed if all counters are non-0, thereby making them relative instead of absolute. This takes O(1) time because if all are non-zero you only need to decrement O(2^k) = O(1) counters by 1 if you perform it on each element. While the algorithm can now process some arbitrarily large inputs, any input array that has a sub-array such that two values a and b are such that count(a) - count(b) > 2^c = max(counter) using a counter strategy will fail for some inputs. In fact a consequence of relying on a O(1) space complexity counter approach is that all arrays that start with 2^c + 1 identical elements cannot be handled by this algorithm.

like image 81
Jesus is Lord Avatar answered Nov 03 '22 07:11

Jesus is Lord