The following is an interview question that a friend was asked. We're both stumped. The statement is as follows:
Consider an unsorted list of integers,
A
. SupposeA
has sizen
. Suppose, further, that the element at indexi
has valuex
. We define two terms:
A
has "a duplicate in index rangek
" if the valuex
is at any ofA[i-k], A[i-k+1],...,A[i-1],A[i+1],...,A[i+k]
(i.e.k
is the radius of the search across indicies)
A
has "a duplicate in value ranged
" if any element ofA
lies in the rangex-d, x-d+1,...,x,x+1,...,x+d
(i.e.d
is the radius of the search across values)In O(n) time and O(k) space, we want to know:
Does
A
have a duplicate in index rangek
?Does
A
have a duplicate in index rangek
AND value ranged
?
The first one is easy: we walk from index 0
to n-1
and keep a set of the past k
numbers we've seen. If our current number is already in the set before we add it, we have a duplicate. Otherwise we don't, and we remove the oldest element in the set to make room for the new one.
The second one, I have no idea. I can think of a solution in O(nd) time and O(k) space: keep the same set as we did for the first problem, but, when we hit x
, check for every number between x-d
and x+d
. But I can't imagine what a solution that doesn't have d
in either the space or time complexity would look like. What am I missing?
You can do the second part by an extension of your first method.
Instead of storing the raw value in the set, instead use a dictionary and store the value divided by d+1 (mapping to the actual value).
The main point is that we can be sure that no 2 values can lie in the same bin (or they are closer than d apart).
Now when you scan through, you need to check for possible collisions in the bin plus the two neighbouring bins.
So this is O(3n) = O(n) operations and O(k) space.
def check_for_duplicate(A,d,k):
D = {} # Stores the previous 2k entries
for i,a in enumerate(A):
b = a // (d+1)
for delta in [-1,0,1]:
if b+delta in D and abs(D[b+delta] - a) <= d:
return True
old = i - 2 * k
if old >= 0:
del D[ A[old] // (d+1) ]
D[b] = a
return False
print check_for_duplicate(A=[8,14],d=5,k=2)
So it occurs to me, you could just do
part1 = false;
part2 = false;
for (index = i - k; index <= i + k, index++) {
diff = ABS(A[index] - x);
if (diff == 0) { part1 = true; }
if (diff < d) { part2 = true; }
}
(not in python, but the algorithm seems sufficient) Let me know if I'm totally missing the point, it's late here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With