Find an element in an array, but the element can jump

Tags:

There is an array where all but one of the cells are 0, and we want to find the index of that single non-zero cell. The problem is, every time that you check for a cell in this array, that non-zero element will do one of the following:

move forward by 1
move backward by 1
stay where it is.

For example, if that element is currently at position 10, and I check what is in arr[5], then the element may be at position 9, 10 or 11 after I checked arr[5].

We only need to find the position where the element is currently at, not where it started at (which is impossible).

The hard part is, if we write a for loop, there really is no way to know if the element is currently in front of you, or behind you.

Some more context if it helps:

The interviewer did give a hint which is maybe I should move my pointer back after checking x-number of cells. The problem is, when should I move back, and by how many slots?
While "thinking out loud", I started saying a bunch of common approaches hoping that something would hit. When I said recursion, the interviewer did say "recursion is a good start". I don't know recursion really is the right approach, because I don't see how I can do recursion and #1 at the same time.
The interviewer said this problem can't be solved in O(n^2). So we are looking at at least O(n^3), or maybe even exponential.

335

asked Mar 22 '16 08:03

Kent

Video Answer

1 Answers

Tl;dr: Your best bet is to keep checking each even index in the array in turn, wrapping around as many times as necessary until you find your target. On average you will stumble upon your target in the middle of your second pass.

First off, as many have already said, it is indeed impossible to ensure you will find your target element in any given amount of time. If the element knows where your next sample will be, it can always place itself somewhere else just in time. The best you can do is to sample the array in a way that minimizes the expected number of accesses - and because after each sample you learn nothing except if you were successful or not and a success means you stop sampling, an optimal strategy can be described simply as a sequence of indexes that should be checked, dependent only on the size of the array you're looking through. We can test each strategy in turn via automated means to see how well they perform. The results will depend on the specifics of the problem, so let's make some assumptions:

The question doesn't specify the starting position our target. Let us assume that the starting position is chosen uniformly from across the entire array.
The question doesn't specify the probability our target moves. For simplicity let's say it's independent on parameters such as the current position in the array, time passed and the history of samples. Using the probability 1/3 for each option gives us the least information, so let's use that.
Let us test our algorithms on an array of ~~100~~ 101 elements. Also, let us test each algorithm one million times, just to be reasonably sure about its average case behavior.

The algorithms I've tested are:

Random sampling: after each attempt we forget where we were looking and choose an entirely new index at random. Each sample has an independent 1/n chance of succeeding, so we expect to take n samples on average. This is our control.
Sweep: try each position in sequence until our target is found. If our target wasn't moving, this would take n/2 samples on average. Our target is moving, however, so we may miss it on our first sweep.
Slow sweep: the same, except we test each position several times before moving on. Proposed by Patrick Trentin with a slowdown factor of 30x, tested with a slowdown factor of 2x.
Fast sweep: the opposite of slow sweep. After the first sample we skip (k-1) cells before testing the next one. The first pass starts at ary[0], the next at ary[1] and so on. Tested with each speed up factor (k) from 2 to 5.
Left-right sweep: First we check each index in turn from left to right, then each index from right to left. This algorithm would be guaranteed to find our target if it was always moving (which it isn't).
Smart greedy: Proposed by Aziuth. The idea behind this algorithm is that we track each cell probability of holding our target, then always sampling the cell with the highest probability. On one hand, this algorithm is relatively complex, on the other hand it sounds like it should give us the optimal results.

Results:

The results are shown as [average] ± [standard derivation].

Random sampling: 100.889145 ± 100.318212

At this point I have realised a fencepost error in my code. Good thing we have a control sample. This also establishes that we have in the ballpark of two or three digits of useful precision (sqrt #samples), which is in line with other tests of this type.
Sweep: 100.327030 ± 91.210692

The chance of our target squeezing through the net well counteracts the effect of the target taking n/2 time on average to reach the net. The algorithm doesn't really fare any better than a random sample on average, but it's more consistent in its performance and it isn't hard to implement either.
slow sweep (x0.5): 128.272588 ± 99.003681

While the slow movement of our net means our target will probably get caught in the net during the first sweep and won't need a second sweep, it also means the first sweep takes twice as long. All in all, relying on the target moving onto us seems a little inefficient.
fast sweep x2: 75.981733 ± 72.620600
fast sweep x3: 84.576265 ± 83.117648
fast sweep x4: 88.811068 ± 87.676049
fast sweep x5: 91.264716 ± 90.337139

That's... a little surprising at first. While skipping every other step means we complete each lap in twice as many turns, each lap also has a reduced chance of actually encountering the target. A nicer view is to compare Sweep and FastSweep in broom-space: rotate each sample so that the index being sampled is always at 0 and the target drifts towards the left a bit faster. In Sweep, the target moves at 0, 1 or 2 speed each step. A quick parallel with the Fibonacci base tells us that the target should hit the broom/net around 62% of the time. If it misses, it takes another 100 turns to come back. In FastSweep, the target moves at 1, 2 or 3 speed each step meaning it misses more often, but it also takes half as much time to retry. Since the retry time drops more than the hit rate, it is advantageous to use FastSweep over Sweep.
Left-right sweep: 100.572156 ± 91.503060

Mostly acts like an ordinary sweep, and its score and standard derivation reflect that. Not too surprising a result.
Aziuth's smart greedy: 87.982552 ± 85.649941

At this point I have to admit a fault in my code: this algorithm is heavily dependent on its initial behavior (which is unspecified by Aziuth and was chosen to be randomised in my tests). But performance concerns meant that this algorithm will always choose the same randomized order each time. The results are then characteristic of that randomisation rather than of the algorithm as a whole.

Always picking the most likely spot should find our target as fast as possible, right? Unfortunately, this complex algorithm barely competes with Sweep 3x. Why? I realise this is just speculation, but let us peek at the sequence Smart Greedy actually generates: During the first pass, each cell has equal probability of containing the target, so the algorithm has to choose. If it chooses randomly, it could pick up in the ballpark of 20% of cells before the dips in probability reach all of them. Afterwards the landscape is mostly smooth where the array hasn't been sampled recently, so the algorithm eventually stops sweeping and starts jumping around randomly. The real problem is that the algorithm is too greedy and doesn't really care about herding the target so it could pick at the target more easily.

Nevertheless, this complex algorithm does fare better than both simple Sweep and a random sampler. it still can't, however, compete with the simplicity and surprising efficiency of FastSweep. Repeated tests have shown that the initial randomisation could swing the efficiency anywhere between 80% run time (20% speedup) and 90% run time (10% speedup).

Finally, here's the code that was used to generate the results:

class WalkSim
  attr_reader :limit, :current, :time, :p_stay
  def initialize limit, p_stay
    @p_stay = p_stay
    @limit = limit
    @current = rand (limit + 1)
    @time = 0
  end

  def poke n
    r = n == @current
    @current += (rand(2) == 1 ? 1 : -1) if rand > @p_stay
    @current = [0, @current, @limit].sort[1]
    @time += 1
    r
  end

  def WalkSim.bench limit, p_stay, runs
    histogram = Hash.new{0}
    runs.times do
      sim = WalkSim.new limit, p_stay
      gen = yield
      nil until sim.poke gen.next
      histogram[sim.time] += 1
    end
    histogram.to_a.sort
  end
end

class Array; def sum; reduce 0, :+; end; end

def stats histogram
  count = histogram.map{|k,v|v}.sum.to_f
  avg = histogram.map{|k,v|k*v}.sum / count
  variance = histogram.map{|k,v|(k-avg)**2*v}.sum / (count - 1)
  {avg: avg, stddev: variance ** 0.5}
end

RUNS = 1_000_000
PSTAY = 1.0/3
LIMIT = 100

puts "random sampling"
p stats WalkSim.bench(LIMIT, PSTAY, RUNS) { 
  Enumerator.new {|y|loop{y.yield rand (LIMIT + 1)}}
}

puts "sweep"
p stats WalkSim.bench(LIMIT, PSTAY, RUNS) { 
  Enumerator.new {|y|loop{0.upto(LIMIT){|i|y.yield i}}}
}

puts "x0.5 speed sweep"
p stats WalkSim.bench(LIMIT, PSTAY, RUNS) { 
  Enumerator.new {|y|loop{0.upto(LIMIT){|i|2.times{y.yield i}}}}
}
(2..5).each do |speed|
  puts "x#{speed} speed sweep"
  p stats WalkSim.bench(LIMIT, PSTAY, RUNS) { 
    Enumerator.new {|y|loop{speed.times{|off|off.step(LIMIT, speed){|i|y.yield i}}}}
  }
end
puts "sweep LR"
p stats WalkSim.bench(LIMIT, PSTAY, RUNS) { 
  Enumerator.new {|y|loop{
    0.upto(LIMIT){|i|y.yield i}
    LIMIT.downto(0){|i|y.yield i}
  }}
}

$sg_gen = Enumerator.new do |y|
  probs = Array.new(LIMIT + 1){1.0 / (LIMIT + 1)}
  loop do
    ix = probs.each_with_index.map{|v,i|[v,rand,i]}.max.last
    probs[ix] = 0
    probs = [probs[0] * (1 + PSTAY)/2 + probs[1] * (1 - PSTAY)/2,
             *probs.each_cons(3).map{|a, b, c| (a + c) / 2 * (1 - PSTAY) + b * PSTAY},
             probs[-1] * (1 + PSTAY)/2 + probs[-2] * (1 - PSTAY)/2]

    y.yield ix
  end
end

$sg_cache = []
def sg_enum; Enumerator.new{|y| $sg_cache.each{|n| y.yield n}; $sg_gen.each{|n| $sg_cache.push n; y.yield n}}; end

puts "smart greedy"
p stats WalkSim.bench(LIMIT, PSTAY, RUNS) {sg_enum}

117

answered Oct 06 '22 02:10

John Dvorak

Related questions
                            
                                Initializing an array inside a structure from another array in C++ [closed]
                            
                                Are PHP Integer array Indexes really numeric?
                            
                                How to call a function with a reference to an unknown size array?
                            
                                float64 to float32 Cython Error
                            
                                Sort an array based on data in another array
                            
                                Hash integer array
                            
                                Best way to sort data [duplicate]
                            
                                Find elements which need to be removed from an array such that 2*min>max
                            
                                C - cache lines and association
                            
                                Python: How to extend a huge class with minimum lines of code?
                            
                                Node.js Promises: Push to array asynchronously & save
                            
                                Is there a way to use ARRAYs in Entity Framework + PostgreSql
                            
                                The best performant way to push items into array?
                            
                                How to properly align a 5x10 2-d array with random data in console?
                            
                                Fill array with zeros
                            
                                Function that returns array of array combinations
                            
                                Is it possible to carry out a function every time an Array is created?
                            
                                PHP Match Array against Partial String
                            
                                Roslyn ObjectPool struct wrapper
                            
                                How to do c++ aligned array allocation?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Find an element in an array, but the element can jump

Tags:

arrays

algorithm

Kent

People also ask

Video Answer

1 Answers

John Dvorak

Recent Activity

Donate For Us