Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Google Interview: Find all contiguous subsequence in a given array of integers, whose sum falls in the given range. Can we do better than O(n^2)?

Tags:

Given an array of Integers, and a range (low, high), find all contiguous subsequence in the array which have sum in the range.

Is there a solution better than O(n^2)?

I tried a lot but couldn't find a solution that does better than O(n^2). Please help me find a better solution or confirm that this is the best we can do.

This is what I have right now, I'm assuming the range to be defined as [lo, hi].

public static int numOfCombinations(final int[] data, final int lo, final int hi, int beg, int end) {     int count = 0, sum = data[beg];      while (beg < data.length && end < data.length) {        if (sum > hi) {           break;        } else {           if (lo <= sum && sum <= hi) {             System.out.println("Range found: [" + beg + ", " + end + "]");             ++count;           }           ++end;           if (end < data.length) {              sum += data[end];           }        }     }     return count; }  public static int numOfCombinations(final int[] data, final int lo, final int hi) {     int count = 0;      for (int i = 0; i < data.length; ++i) {         count += numOfCombinations(data, lo, hi, i, i);     }      return count; } 
like image 551
user1071840 Avatar asked Jul 03 '14 01:07

user1071840


People also ask

What is contiguous subsequence?

A contiguous subsequence of a list S is a subsequence made up of consecutive elements of S. If S is {5, 15, -30, 10, -5, 40, 10} then 15, -30, 10 is a contiguous subsequence.


2 Answers

O(n) time solution:

You can extend the 'two pointer' idea for the 'exact' version of the problem. We will maintain variables a and b such that all intervals on the form xs[i,a), xs[i,a+1), ..., xs[i,b-1) have a sum in the sought after range [lo, hi].

a, b = 0, 0 for i in range(n):     while a != (n+1) and sum(xs[i:a]) < lo:         a += 1     while b != (n+1) and sum(xs[i:b]) <= hi:         b += 1     for j in range(a, b):         print(xs[i:j]) 

This is actually O(n^2) because of the sum, but we can easily fix that by first calculating the prefix sums ps such that ps[i] = sum(xs[:i]). Then sum(xs[i:j]) is simply ps[j]-ps[i].

Here is an example of running the above code on [2, 5, 1, 1, 2, 2, 3, 4, 8, 2] with [lo, hi] = [3, 6]:

[5] [5, 1] [1, 1, 2] [1, 1, 2, 2] [1, 2] [1, 2, 2] [2, 2] [2, 3] [3] [4] 

This runs in time O(n + t), where t is the size of the output. As some have noticed, the output can be as large as t = n^2, namely if all contiguous subsequences are matched.

If we allow writing the output in a compressed format (output pairs a,b of which all subsequences are contiguous) we can get a pure O(n) time algorithm.

like image 157
Thomas Ahle Avatar answered Nov 20 '22 06:11

Thomas Ahle


Starting from this problem: find all contiguous sub-sequences that sum to x. What we need is something similar.

For every index i, we can calculate the sum of the segment from 0 to i, which is x. So, the problem now is we need to find from 0 to i - 1, how many segments have sum from (x - low) to (x - high), and it should be faster than O(n). So there are several data structures help you to do that in O(logn), which are Fenwick tree and Interval tree.

So what we need to do is:

  • Iterating through all index from 0 to n (n is the size of the array).

  • At index ith, calculate, starting from 0 to ith index, the sum x, query the tree to get the total occurrences of numbers fall in the range (x - high, x - low).

  • Add x to the tree.

So the time complexity will be O(n log n)

like image 29
Pham Trung Avatar answered Nov 20 '22 04:11

Pham Trung