 

Find number of 1's in sequence of 0's and 1's

This was an interview question. Initially I found it very easy, almost silly. In the end I failed to solve it. :(

The question is: we have an array that contains a sequence of 0's, followed by a sequence of 1's, followed by a sequence of 0's, something like this:

   int[] arr = {0, 0, 0, 1, 1, 1, 0, 0, 0};

Now, find the number of 1's in the array in O(log n) time.

Any idea? :(

Jeevi asked Jul 02 '14 at 13:07


2 Answers

Well, how to put this…

You can't. You currently have only three assumptions on the array:

  • it doesn't start with a 1,
  • it doesn't end with a 1,
  • for any two 1s, there is no 0 between them.

In order to find something in an array, you can use linear search or binary* search. Linear search doesn't seem appropriate, since you want to achieve logarithmic time. Binary search, however, requires that arr[i] <= arr[j] for all i < j. Since this isn't the case here, you don't know which half contains the 1s. This means you would need to check both sides, which leads to linear complexity.

Why n-ary (n >= 2) search doesn't work

So why doesn't any kind of binary search work? Before I answer this question, let's have a quick look at how binary search actually works, since there seems to be a little bit of confusion:

How does binary search work?

Binary search works so well because it can reduce the size of the problem heavily: in each iteration the problem size gets halved.

For example, let's say we're looking for 5 in the ordered sequence 000011223334456.

[Image: initial search tree]

This works perfectly, and we get the following sequence: we first check the middle, which is 2. Therefore we know that the solution is in the right side, and we never have to check the left side. The middle of the right side is 4, so we can again cut off its left half. The next middle is 5. We stop:

[Image: search in progress]

In the image, the red section will never be inspected. Never. For the complexity, let n be our original problem size. After a single step, our problem has size n/2. After the k-th step, it has size n/(2^k). So if k = log2(n), we have reduced our problem to size one. Since we check only one value in each step, and we have a total of log2(n) steps (rounded up to the next integer), we have logarithmic time complexity.
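To make the halving concrete, here is a standard textbook binary search in Java (my own sketch, not part of the original answer), run against the example sequence above: it checks 2, then 4, then finds 5.

```java
// A standard binary search on a sorted array: each iteration halves the problem.
public class BinarySearchDemo {
    // Returns an index of target in the sorted array a, or -1 if absent.
    static int binarySearch(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;       // middle of the current range
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1;  // target can only be in the right half
            else hi = mid - 1;                  // target can only be in the left half
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {0, 0, 0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 4, 5, 6};
        System.out.println(binarySearch(a, 5)); // prints 13
    }
}
```

Note that the ordering is exactly what lets the search discard a whole half without looking at it.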

Why binary search doesn't work in your situation

The short answer is: because your array isn't sorted. Let's try to use binary search anyway:

[Image: the unsorted array as a search tree]

What happens after a single step?

[Image: inspecting the middle element]

Inspecting the middle value doesn't give us any information at all, except that it isn't a 1. We don't know whether we need to traverse the left or the right subtree, so we need to traverse both of them. And in order to get a deterministic algorithm, we need to fix the order of traversal for all inputs. If we traverse right-to-left, we find the 1s relatively fast:

[Image: right-to-left traversal]

But if the 1s were at the left, we would check almost all elements. Note also that we can't rule out as many nodes as we could in a real binary search.

By the way, the alternating variants suffer from the same problem, since alternating just means level-based traversal. Instead of following a path, you would check the nodes level by level:

[Image: alternating (level-based) traversal]

Some of the comments propose searching both subtrees in parallel. While this actually cuts down the total wall-clock time, time complexity is measured in the sense of Turing machines, and at one point or another you're going to run out of tapes, or CPUs. Remember folks, this is about theoretical computational time complexity.

Why is it so important to find a single 1?

If you can't find a single 1 in logarithmic time, you also can't find a boundary pair such as (0,1) in logarithmic time. But if you know the position of a single 1, then the left side and the right side of that position are ordered sequences, since they look like 000...011...11 and 11...1100...00, and on those one can use binary search.

So can one actually solve it in logarithmic time?

After all the discussion, it should be clear that we need linear runtime to find even a single 1.

However, together with an assumption, we can find the edges in logarithmic time and subtract their positions:

  • the array has a 1 at position k (your example suggests one at size/2)

If you're wondering how this helps, have a look at the previous section.

However, without the additional assumption it can't be done.
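Under that assumption, the edge-finding step can be sketched in Java (the helper names `leftEdge`, `rightEdge`, and `countOnes` are mine; the code assumes we are given an index k with arr[k] == 1, as the bullet above requires):

```java
// Sketch: given an index k with arr[k] == 1, both sides around k are
// monotone, so the edges of the block of 1's can be found with two
// binary searches in O(log n) total.
public class CountOnes {
    // Leftmost 1: binary search on the ordered range [0, k].
    static int leftEdge(int[] arr, int k) {
        int lo = 0, hi = k;
        while (lo < hi) {
            int mid = lo + (hi - lo) / 2;
            if (arr[mid] == 1) hi = mid;      // edge is at mid or to its left
            else lo = mid + 1;                // edge is to the right of mid
        }
        return lo;
    }

    // Rightmost 1: binary search on the ordered range [k, n-1].
    static int rightEdge(int[] arr, int k) {
        int lo = k, hi = arr.length - 1;
        while (lo < hi) {
            int mid = lo + (hi - lo + 1) / 2; // round up so the loop terminates
            if (arr[mid] == 1) lo = mid;      // edge is at mid or to its right
            else hi = mid - 1;                // edge is to the left of mid
        }
        return lo;
    }

    static int countOnes(int[] arr, int k) {
        return rightEdge(arr, k) - leftEdge(arr, k) + 1;
    }

    public static void main(String[] args) {
        int[] arr = {0, 0, 0, 1, 1, 1, 0, 0, 0};
        System.out.println(countOnes(arr, 4)); // prints 3
    }
}
```

Subtracting the two edge positions (plus one) gives the count, all in logarithmic time once k is known.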


* or any other n-ary search (n > 2), but they all have logarithmic cost

Zeta answered Sep 21 '22


Best solution I can come up with:

  1. Use random probing or a random hash probe to find a 1 in the array
  2. Use binary searches from there to find the first and last 1s in the array

Step 2 is trivially O(log n). The problem is that step 1 is O(n/k), where k is the number of 1s in the array. This is O(n) if the number of 1s is unrestricted, but if the number of 1s is required to be a certain fraction of the array (at least 10%, or at least 1%, or any lower bound linear in n), then it becomes O(1) average case (though still O(n) worst case).
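A rough Java sketch of the two steps (the helper names are mine; note that step 1 as written loops forever on an all-zero array, so real code would cap the number of probes):

```java
import java.util.Random;

// Sketch of the two-step idea: (1) random probing until we hit a 1,
// (2) two binary searches for the edges of the 1-block.
public class RandomProbeCount {
    static final Random RNG = new Random();

    // Step 1: expected O(n/k) probes when the array holds k ones.
    static int probeForOne(int[] arr) {
        while (true) {
            int i = RNG.nextInt(arr.length);
            if (arr[i] == 1) return i;
        }
    }

    // Step 2: binary-search both edges around the known 1 at index k.
    static int countOnes(int[] arr) {
        int k = probeForOne(arr);
        int lo = 0, hi = k;                  // leftmost 1 in [0, k]
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (arr[mid] == 1) hi = mid; else lo = mid + 1;
        }
        int left = lo;
        lo = k; hi = arr.length - 1;         // rightmost 1 in [k, n-1]
        while (lo < hi) {
            int mid = (lo + hi + 1) / 2;     // round up to guarantee progress
            if (arr[mid] == 1) lo = mid; else hi = mid - 1;
        }
        return lo - left + 1;
    }

    public static void main(String[] args) {
        System.out.println(countOnes(new int[]{0, 0, 0, 1, 1, 1, 0, 0, 0})); // prints 3
    }
}
```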

Chris Dodd answered Sep 19 '22