I have read this problem Find the most common entry in an array
and the answer from jon skeet is just mind blowing .. :)
Now I am trying to solve this problem find an element which occurs more than n/3 times in an array ..
I am pretty sure that we cannot apply the same method because there can be 2 such elements which will occur more than n/3 times and that gives false alarm of the count ..so is there any way we can tweak around jon skeet's answer to work for this ..?
Or is there any solution that will run in linear time ?
Solution Approachif (arr[i] > max) -> max3 = max2, max2 = max , max = arr[i]. else if (arr[i] > max2) -> max3 = max2, max2 = arr[i]. else if (arr[i] > max3) -> max3 = arr[i]. At the end of the loop, we will print all three values.
If there is no such element in the array, return -1. For Example: If the given array ARR = {3,3,3,1,5,6,3} we can see that here 3 occurs 4 times in the array, which is greater than 7/3(N/3), so the dominant number here is 3.
I use the following Python solution to discuss the correctness of the algorithm:
class Solution:
"""
@param: nums: a list of integers
@return: The majority number that occurs more than 1/3
"""
def majorityNumber(self, nums):
if nums is None:
return None
if len(nums) == 0:
return None
num1 = None
num2 = None
count1 = 0
count2 = 0
# Loop 1
for i, val in enumerate(nums):
if count1 == 0:
num1 = val
count1 = 1
elif val == num1:
count1 += 1
elif count2 == 0:
num2 = val
count2 = 1
elif val == num2:
count2 += 1
else:
count1 -= 1
count2 -= 1
count1 = 0
count2 = 0
for val in nums:
if val == num1:
count1 += 1
elif val == num2:
count2 += 1
if count1 > count2:
return num1
return num2
First, we need to prove claim A:
Claim A: Consider a list C
which contains a majority number m
which occurs more floor(n/3)
times. After 3 different numbers are removed from C
, we have C'
. m
is the majority number of C'
.
Proof: Use R
to denote m
's occurrence count in C
. We have R > floor(n/3)
. R > floor(n/3)
=> R - 1 > floor(n/3) - 1
=> R - 1 > floor((n-3)/3)
. Use R'
to denote m
's occurrence count in C'
. And use n'
to denote the length of C'
. Since 3 different numbers are removed, we have R' >= R - 1
. And n'=n-3
is obvious. We can have R' > floor(n'/3)
from R - 1 > floor((n-3)/3)
. So m
is the majority number of C'
.
Now let's prove the correctness of the loop 1
. Define L
as count1 * [num1] + count2 * [num2] + nums[i:]
. Use m
to denote the majority number.
Invariant
The majority number m
is in L
.
Initialization
At the start of the first itearation, L
is nums[0:]
. So the invariant is trivially true.
Maintenance
if count1 == 0
branch: Before the iteration, L
is count2 * [num2] + nums[i:]
. After the iteration, L
is 1 * [nums[i]] + count2 * [num2] + nums[i+1:]
. In other words, L
is not changed. So the invariant is maintained.
if val == num1
branch: Before the iteration, L
is count1 * [nums[i]] + count2 * [num2] + nums[i:]
. After the iteration, L
is (count1+1) * [num[i]] + count2 * [num2] + nums[i+1:]
. In other words, L
is not changed. So the invariant is maintained.
f count2 == 0
branch: Similar to condition 1.elif val == num2
branch: Similar to condition 2.else
branch: nums[i]
, num1
and num2
are different to each other in this case. After the iteration, L
is (count1-1) * [num1] + (count2-1) * [num2] + nums[i+1:]
. In other words, three different numbers are moved from count1 * [num1] + count2 * [num2] + nums[i:]
. From claim A, we know m
is the majority number of L
.So the invariant is maintained.Termination
When the loop terminates, nums[n:]
is empty. L
is count1 * [num1] + count2 * [num2]
.
So when the loop terminates, the majority number is either num1
or num2
.
Jan Dvorak's answer is probably best:
At the end, make a second pass over the array to check whether the candidates really do have the required count. This isn't allowed by the question you link to but I don't see how to avoid it for this modified version. If there is a value that occurs more than n/3 times then it will be in a slot, but you don't know which one it is.
If this modified version of the question guaranteed that there were two values with more than n/3
elements (in general, k-1
values with more than n/k
) then we wouldn't need the second pass. But when the original question has k=2
and 1 guaranteed majority there's no way to know whether we "should" generalize it as guaranteeing 1 such element or guaranteeing k-1
. The stronger the guarantee, the easier the problem.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With