This was an interview question. Initially i found it to be very easy and silly. At-last i failed to solve this. :(
The question is, we have an array which has sequence of 0's followed by sequence of 1's and a sequence of 0's. something like this.
int [] arr ={0,0,0,1,1,1,0,0,0}
Now, find the number of 1's in the array in log(n) time.
AnyIdea ? :(
You can't. You currently have only three assumptions on the array:
1
,1
,1
s, there is no 0
between them.In order to find something in a array, you can use linear search and binary* search. Linear search doesn't seem appropriate, since you want to achieve logarithmic time.
However, for binary search, you need that arr[i] <= arr[j]
for all i < j
. Since this isn't the case, you don't know which half contains the 1
. This means that you would need to check both sides, which leads to linear complexity.
So why doesn't any kind of binary search work? Before I answer this question, lets put in a quick section about how binary search actually works, since there seems to be a little bit of confusion:
Binary search works so well because it can reduce the size of the problem heavily: in each iteration the problem size gets halved.
For example, lets say we're looking for 5
in the ordered sequence 000011223334456
.
This works perfectly, since we get the following sequence: we first check the middle, which is 2. Therefore we know that the solution is in the right side, and we never have to check the left side. The middle of the right side is four, so we can again cut off the left half. The next middle is 5. We stop:
In the image, the red section will never be inspected. Never. For the complexity, let n be our original problem size. After a single step, our problem has size n/2
. After the k-th step, it has size n/(2^k). So if k = log2 (n), we have reduced our problem to size one. Since we checked only one value in each step, and we have a total of log2(n) steps (rounded up to the next integral), we have logarithmic time complexity.
The short answer is: because your problem isn't sorted. Lets try to use binary search:
What happens after a single step?
Inspecting the middle value doesn't give us any information at all, except for that we know that it isn't a 1
. We don't know whether we need to traverse the left or the right tree. Therefore we need to traverse both of them. And in order to create a deterministic algorithm, we need to fix the order of traversal for all applications. If we traverse right-to-left, we find it relatively fast:
But if the 1
was at the left, we would check almost all elements. You also note that we can't rule out as many nodes as we could in the real binary search.
By the way, the alternating variants also suffer from the same problem, since alternating only means level-based-traversal. Instead of following a path, you would check the nodes based on their levels:
Some of the comments propose parallel/simultaneously search in both trees. While this actually cuts down the total time, time complexity is measured in the sense of turing machines. And at one point or another you're going to run out of strips, or CPUs. Remember folks, this is about theoretical computational time complexity.
If you can't find a single value in logarithmic time, you cannot find a pair of values e.g. (0,1) in logarithmic time. But if you know the position of a single 1, then the left side and and the right side are ordered sets, since they are 000....011..11
and 11....1100...00
and one can use binary search.
After all the discussion, it should be clear that we need linear runtime to find even a single 1.
However, together with an assumption, we can find the edges in logarithmic time and subtract their positions:
1
at position k
(your example suggests one at size/2
)If you're wondering how this helps, have a look at the previous section.
However, without the additional assumption it can't be done.
* or any other n-ary search (n > 2), but they all have logarithmic cost
Best solution I can come up with:
Step 2 is O(log n) trivially. The problem is that step 1 is O(n/k) where k is the number of 1s in the array. This is O(n) if the number of 1s is unrestricted in any way, but if instead the number of 1s is required to be a certain fraction of the array (at least 10% or at least 1% or any lower bound linear in n), then it becomes O(1) average case (though still O(n) worst case)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With