Algorithm to find a repeated number in a list that may contain any number of repeats

Tags:

Please read this question carefully before closing it as a duplicate, though if it is an honest duplicate I'll be happy to know about it. This is a generalization of Find any one of multiple possible repeated integers in a list.

Given any set S of N distinct integers, and any array A of length N+1 each entry of which is taken from S, what is the best algorithm to find some (there must be at least one) repeated entry of A?

NOTE: There could be multiple repeated entries in A, and any entry could be repeated multiple times.

As Nemo points out, the trivial solution takes O(1) space and O(N^2) time. I'm looking for a solution that improves the time without compromising the space too much. To be precise, the solution(s) I'm looking for:

Returns a value a that appears in A at least twice,
Uses at most O(log N) space without modifying A, and
Takes less than O(N^2) time

EDIT: The set S is there to ensure that the array A has at least one repeated entry. For this problem, do not assume that you have S given to you as an ordered set. You can query S (boolean to return true is s in S and false otherwise) and you can query A (call for A[i]), but that's all. Any solution that sorts A or S exceeds the space limit.

This generalization invalidates my pointer solution to the original question (which has O(1) space and O(N) time), and the space constraint I'm imposing invalidates fiver's solution (which has O(N) space and time).

847

asked Jun 22 '11 00:06

PengOne

1 Answers

This algorithm is similar to Justin Simon's, but the key point is how to compute the median (or the kth element) of S using just O(1) space efficiently.

Here is that key algorithm, which is randomized:

Set lower equal to the minimum element of S and upper equal to the maximum element of S. Pick a random element x from S that is between lower and upper (this costs at most O(n) expected time). Compute the rank of x (O(n) time). If x's rank is too low, set lower to the successor of x (O(n) time), else set upper equal to the predecessor of x (O(n) time). Repeat until lower equals upper.

Note that each iteration costs O(n) in expectation and there are O(lg n) iterations in expectation so the expected time cost is O(n lg n) and space usage is O(1) since we only store lower and upper.

Using this ability to select the kth element, we can then use the pigeonhole principle as suggested in the original question to find increasingly small segments of S that contain too many elements to all be distinct, using O(lg n) linear scans of A and O(1) space to store the relevant sums of elements in each region. Each such iteration costs O(n) in addition to the O(n lg n) cost of finding the kth element, and there are O(lg n) iterations, so the total cost is O(n lg^2 n).

129

answered Oct 27 '22 07:10

jonderry

Related questions
                            
                                Algorithm to evenly distribute values into containers?
                            
                                How does the Franceschini method work?
                            
                                Find the sum of least common multiples of all subsets of a given set
                            
                                How can you parallelize a regex search of one long string? [duplicate]
                            
                                How do URL shortener calculate the URL key? How do they work?
                            
                                Does anyone know what google news uses for their clustering algorithm?
                            
                                How does the banking transactions work "under the hood" - possibly in detail
                            
                                Game on the tree, cutting branch
                            
                                What is the fastest way to calculate frequency distribution for array in C#?
                            
                                Fisher Yates variation
                            
                                algorithm - Sort an array with LogLogN distinct elements
                            
                                Why does Big-O Notation use O(1) instead of O(k)?
                            
                                Removing minimum no of edges to disconnect two vertices in a graph
                            
                                Dictionary using Red-Black tree - deletion error
                            
                                Algorithm for merging spatially close paths / line segments
                            
                                What does `(i & (i + 1)) - 1` mean? (in Fenwick Trees)
                            
                                Fast calculation of floating 1/N if factorization of very large integer N is known
                            
                                looking for a tuple matching algorithm
                            
                                Partitioning big rectangle to small ones (2D Packing)
                            
                                generating an sequential five digit alphanumerical ID

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Algorithm to find a repeated number in a list that may contain any number of repeats

Tags:

algorithm

search

PengOne

People also ask

1 Answers

jonderry

Recent Activity

Donate For Us