Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

No. of distinct subsequences of length 3 in an array of length n

How to calculate the number of distinct sub sequences of length 3 (or in general of length k < n) in an array of length n?

Note: Two sub sequences are considered different if the order of elements in them are different.

Ex: Suppose the array A = [1, 2, 1, 1], then the answer should be 3 because there are only three distinct subsequences of length 3 as shown below:

[1, 1, 1]
[1, 2, 1]
[2, 1, 1]

Size of the array n <= 10^5, each element in the array A_i <= n.

My approach:

I figured the brute force approach, i.e., to take tuples of length 3 and insert it into a map. But this is neither space/time efficient.

Edit: This was an interview question and it said that for k = 3 the expected time and space complexity is O(n).

like image 499
dodobhoot Avatar asked Jun 05 '18 14:06

dodobhoot


People also ask

How do you find the number of subsequences?

Given a string, find the count of distinct subsequences of it. The problem of counting distinct subsequences is easy if all characters of input string are distinct. The count is equal to nC0 + nC1 + nC2 + … nCn = 2n. How to count distinct subsequences when there can be repetition in input string?

How do you find the subsequences of length k?

Given an array arr[]and an integer K, the task is to find the sum of all K length subsequences from the given array. Explanation: There are 6 subsequences of length 2 which are {7, 8}, {7, 9}, {7, 2}, {8, 9}, {8, 2} and {9, 2}.

How many subsequences are there in a sequence?

Bookmark this question. Show activity on this post. If anyone here is familiar with the Lowest Common Subsequence problem, they probably know that the number of posibble subsequences in a sequence is 2n; n being the length of the sequence.


2 Answers

As is often the case with interview problems, there's a dynamic programming solution. Let T(m, k) be the number of distinct length-k subsequences of the first m elements. Then assuming one-based indexing on the input A, we have a 2D recurrence

T(m, 0) = 1
T(m, k) = T(m-1, k) + T(m-1, k-1) -
          ^^^^^^^^^   ^^^^^^^^^^^
     A_m not chosen   A_m chosen

            { T(i-1, k-1), if i < m is the maximum index where A_i = A_m
            { 0,           if no such index exists

The subtracted term ensures that we don't count duplicates; see https://stackoverflow.com/a/5152203/2144669 for more explanation.

The running time (with a hash map to maintain the rightmost occurrence so far of each symbol seen) is O(k n), which is O(n) for k = 3.

like image 175
David Eisenstat Avatar answered Nov 16 '22 03:11

David Eisenstat


Here's a slightly different take. We can think of the number of ways an element, m, can be kth in the subsequence as the sum of all the ways the previous occurence of any element (including m) can be (k-1)th. As we move right, however, the only update needed is for m; the other sums stay constant.

For example,

// We want to avoid counting [1,1,1], [1,2,1], etc. twice
[1, 2, 1, 1, 1]

(display the array vertically for convenience)

            <-  k  ->
[1,  ->  1: [1, 0, 0]
 2,  ->  2: [1, 1, 0]
 1,  ->  1: [1, 2, 1]
 1,  ->  1: [1, 2, 3]
 1]  ->  1: [1, 2, 3]

Now if we added another element, say 3,

...
 3]  ->  3: [1, 2, 3]

 // 1 means there is one way
 // the element, 3, can be first

 // 2 means there are 2 ways
 // 3 can be second: sum distinct
 // column k[0] = 1 + 1 = 2

 // 3 means there are 3 ways
 // 3 can be third: sum distinct
 // column k[1] = 2 + 1 = 3

Sum distinct k[2] column:

0 + 3 + 3 = 6 subsequences

[1,2,1], [2,1,1], [1,1,1]
[1,1,3], [2,1,3], [3,2,1]

The sum-distinct for each column can be updated in O(1) per iteration. The k sums for the current element (we update a single list of those for each element), take O(k), which in our case is O(1).

JavaScript code:

function f(A, k){
  A.unshift(null);
  
  let sumDistinct = new Array(k + 1).fill(0);
  let hash = {};

  sumDistinct[0] = 1;

  for (let i=1; i<A.length; i++){
    let newElement;
    
    if (!hash[A[i]]){
      hash[A[i]] = new Array(k + 1).fill(0);
      newElement = true;
    }
    
    let prev = hash[A[i]].slice();

    // The number of ways an element, m, can be k'th
    // in the subsequence is the sum of all the ways
    // the previous occurence of any element
    // (including m) can be (k-1)'th
    for (let j=1; j<=k && j<=i; j++)
      hash[A[i]][j] = sumDistinct[j - 1];

    for (let j=2; j<=k && j<=i; j++)
      sumDistinct[j] = sumDistinct[j] - prev[j] + hash[A[i]][j];

    if (newElement)
      sumDistinct[1] += 1;

    console.log(JSON.stringify([A[i], hash[A[i]], sumDistinct]))
  }

  return sumDistinct[k];
}

var arr = [1, 2, 1, 1, 1, 3, 2, 1];

console.log(f(arr, 3));
like image 35
גלעד ברקן Avatar answered Nov 16 '22 03:11

גלעד ברקן