Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to find a repeated number in a list that may contain any number of repeats

Please read this question carefully before closing it as a duplicate, though if it is an honest duplicate I'll be happy to know about it. This is a generalization of Find any one of multiple possible repeated integers in a list.

Given any set S of N distinct integers, and any array A of length N+1 each entry of which is taken from S, what is the best algorithm to find some (there must be at least one) repeated entry of A?

NOTE: There could be multiple repeated entries in A, and any entry could be repeated multiple times.

As Nemo points out, the trivial solution takes O(1) space and O(N^2) time. I'm looking for a solution that improves the time without compromising the space too much. To be precise, the solution(s) I'm looking for:

  • Returns a value a that appears in A at least twice,
  • Uses at most O(log N) space without modifying A, and
  • Takes less than O(N^2) time

EDIT: The set S is there to ensure that the array A has at least one repeated entry. For this problem, do not assume that you have S given to you as an ordered set. You can query S (boolean to return true is s in S and false otherwise) and you can query A (call for A[i]), but that's all. Any solution that sorts A or S exceeds the space limit.

This generalization invalidates my pointer solution to the original question (which has O(1) space and O(N) time), and the space constraint I'm imposing invalidates fiver's solution (which has O(N) space and time).

like image 847
PengOne Avatar asked Jun 22 '11 00:06

PengOne


People also ask

How do you find duplicate numbers in an array if it contains multiple duplicates?

Duplicate elements can be found using two loops. The outer loop will iterate through the array from 0 to length of the array. The outer loop will select an element. The inner loop will be used to compare the selected element with the rest of the elements of the array.

How do you find the number of times an element is repeated in an array?

Function findRepeat(int arr[],int n) takes an array and its length as input and displays the repeated element value and length of repeated elements. Take the initial count as 0. Starting from index i=0 to i<n.


1 Answers

This algorithm is similar to Justin Simon's, but the key point is how to compute the median (or the kth element) of S using just O(1) space efficiently.

Here is that key algorithm, which is randomized:

Set lower equal to the minimum element of S and upper equal to the maximum element of S. Pick a random element x from S that is between lower and upper (this costs at most O(n) expected time). Compute the rank of x (O(n) time). If x's rank is too low, set lower to the successor of x (O(n) time), else set upper equal to the predecessor of x (O(n) time). Repeat until lower equals upper.

Note that each iteration costs O(n) in expectation and there are O(lg n) iterations in expectation so the expected time cost is O(n lg n) and space usage is O(1) since we only store lower and upper.

Using this ability to select the kth element, we can then use the pigeonhole principle as suggested in the original question to find increasingly small segments of S that contain too many elements to all be distinct, using O(lg n) linear scans of A and O(1) space to store the relevant sums of elements in each region. Each such iteration costs O(n) in addition to the O(n lg n) cost of finding the kth element, and there are O(lg n) iterations, so the total cost is O(n lg^2 n).

like image 129
jonderry Avatar answered Oct 27 '22 07:10

jonderry