Please read this question carefully before closing it as a duplicate, though if it is an honest duplicate I'll be happy to know about it. This is a generalization of Find any one of multiple possible repeated integers in a list.
Given any set S of N distinct integers, and any array A of length N+1 each entry of which is taken from S, what is the best algorithm to find some (there must be at least one) repeated entry of A?
NOTE: There could be multiple repeated entries in A, and any entry could be repeated multiple times.
As Nemo points out, the trivial solution takes O(1) space and O(N^2) time. I'm looking for a solution that improves the time without compromising the space too much. To be precise, the solution(s) I'm looking for:
EDIT: The set S is there to ensure that the array A has at least one repeated entry. For this problem, do not assume that you have S given to you as an ordered set. You can query S (boolean to return true
is s in S and false
otherwise) and you can query A (call for A[i]), but that's all. Any solution that sorts A or S exceeds the space limit.
This generalization invalidates my pointer solution to the original question (which has O(1) space and O(N) time), and the space constraint I'm imposing invalidates fiver's solution (which has O(N) space and time).
Duplicate elements can be found using two loops. The outer loop will iterate through the array from 0 to length of the array. The outer loop will select an element. The inner loop will be used to compare the selected element with the rest of the elements of the array.
Function findRepeat(int arr[],int n) takes an array and its length as input and displays the repeated element value and length of repeated elements. Take the initial count as 0. Starting from index i=0 to i<n.
This algorithm is similar to Justin Simon's, but the key point is how to compute the median (or the kth element) of S using just O(1) space efficiently.
Here is that key algorithm, which is randomized:
Set lower equal to the minimum element of S and upper equal to the maximum element of S. Pick a random element x from S that is between lower and upper (this costs at most O(n) expected time). Compute the rank of x (O(n) time). If x's rank is too low, set lower to the successor of x (O(n) time), else set upper equal to the predecessor of x (O(n) time). Repeat until lower equals upper.
Note that each iteration costs O(n) in expectation and there are O(lg n) iterations in expectation so the expected time cost is O(n lg n) and space usage is O(1) since we only store lower and upper.
Using this ability to select the kth element, we can then use the pigeonhole principle as suggested in the original question to find increasingly small segments of S that contain too many elements to all be distinct, using O(lg n) linear scans of A and O(1) space to store the relevant sums of elements in each region. Each such iteration costs O(n) in addition to the O(n lg n) cost of finding the kth element, and there are O(lg n) iterations, so the total cost is O(n lg^2 n).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With