Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to determine the longest increasing subsequence using dynamic programming?

People also ask

How do you find the longest increasing subsequence?

To find the LIS for a given array, we need to return max(L(i)) where 0 < i < n. Formally, the length of the longest increasing subsequence ending at index i, will be 1 greater than the maximum of lengths of all longest increasing subsequences ending at indices before i, where arr[j] < arr[i] (j < i).

How do you find the longest non decreasing subsequence?

Time: O(K * N/K * log(N/K)) = O(N * log(N/K)) , where N <= 10^5 is length of arr , K <= N . We have total K new arr, each array have up to N/K elements. We need O(M * logM) to find the longest non-decreasing subsequence of an array length M .

What is meant by longest increasing subsequence?

In computer science, the longest increasing subsequence problem is to find a subsequence of a given sequence in which the subsequence's elements are in sorted order, lowest to highest, and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous, or unique.


OK, I will describe first the simplest solution which is O(N^2), where N is the size of the collection. There also exists a O(N log N) solution, which I will describe also. Look here for it at the section Efficient algorithms.

I will assume the indices of the array are from 0 to N - 1. So let's define DP[i] to be the length of the LIS (Longest increasing subsequence) which is ending at element with index i. To compute DP[i] we look at all indices j < i and check both if DP[j] + 1 > DP[i] and array[j] < array[i] (we want it to be increasing). If this is true we can update the current optimum for DP[i]. To find the global optimum for the array you can take the maximum value from DP[0...N - 1].

int maxLength = 1, bestEnd = 0;
DP[0] = 1;
prev[0] = -1;

for (int i = 1; i < N; i++)
{
   DP[i] = 1;
   prev[i] = -1;

   for (int j = i - 1; j >= 0; j--)
      if (DP[j] + 1 > DP[i] && array[j] < array[i])
      {
         DP[i] = DP[j] + 1;
         prev[i] = j;
      }

   if (DP[i] > maxLength)
   {
      bestEnd = i;
      maxLength = DP[i];
   }
}

I use the array prev to be able later to find the actual sequence not only its length. Just go back recursively from bestEnd in a loop using prev[bestEnd]. The -1 value is a sign to stop.


OK, now to the more efficient O(N log N) solution:

Let S[pos] be defined as the smallest integer that ends an increasing sequence of length pos. Now iterate through every integer X of the input set and do the following:

  1. If X > last element in S, then append X to the end of S. This essentially means we have found a new largest LIS.

  2. Otherwise find the smallest element in S, which is >= than X, and change it to X. Because S is sorted at any time, the element can be found using binary search in log(N).

Total runtime - N integers and a binary search for each of them - N * log(N) = O(N log N)

Now let's do a real example:

Collection of integers: 2 6 3 4 1 2 9 5 8

Steps:

0. S = {} - Initialize S to the empty set
1. S = {2} - New largest LIS
2. S = {2, 6} - New largest LIS
3. S = {2, 3} - Changed 6 to 3
4. S = {2, 3, 4} - New largest LIS
5. S = {1, 3, 4} - Changed 2 to 1
6. S = {1, 2, 4} - Changed 3 to 2
7. S = {1, 2, 4, 9} - New largest LIS
8. S = {1, 2, 4, 5} - Changed 9 to 5
9. S = {1, 2, 4, 5, 8} - New largest LIS

So the length of the LIS is 5 (the size of S).

To reconstruct the actual LIS we will again use a parent array. Let parent[i] be the predecessor of an element with index i in the LIS ending at the element with index i.

To make things simpler, we can keep in the array S, not the actual integers, but their indices(positions) in the set. We do not keep {1, 2, 4, 5, 8}, but keep {4, 5, 3, 7, 8}.

That is input[4] = 1, input[5] = 2, input[3] = 4, input[7] = 5, input[8] = 8.

If we update properly the parent array, the actual LIS is:

input[S[lastElementOfS]], 
input[parent[S[lastElementOfS]]],
input[parent[parent[S[lastElementOfS]]]],
........................................

Now to the important thing - how do we update the parent array? There are two options:

  1. If X > last element in S, then parent[indexX] = indexLastElement. This means the parent of the newest element is the last element. We just prepend X to the end of S.

  2. Otherwise find the index of the smallest element in S, which is >= than X, and change it to X. Here parent[indexX] = S[index - 1].


Petar Minchev's explanation helped clear things up for me, but it was hard for me to parse what everything was, so I made a Python implementation with overly-descriptive variable names and lots of comments. I did a naive recursive solution, the O(n^2) solution, and the O(n log n) solution.

I hope it helps clear up the algorithms!

The Recursive Solution

def recursive_solution(remaining_sequence, bigger_than=None):
    """Finds the longest increasing subsequence of remaining_sequence that is      
    bigger than bigger_than and returns it.  This solution is O(2^n)."""

    # Base case: nothing is remaining.                                             
    if len(remaining_sequence) == 0:
        return remaining_sequence

    # Recursive case 1: exclude the current element and process the remaining.     
    best_sequence = recursive_solution(remaining_sequence[1:], bigger_than)

    # Recursive case 2: include the current element if it's big enough.            
    first = remaining_sequence[0]

    if (first > bigger_than) or (bigger_than is None):

        sequence_with = [first] + recursive_solution(remaining_sequence[1:], first)

        # Choose whichever of case 1 and case 2 were longer.                         
        if len(sequence_with) >= len(best_sequence):
            best_sequence = sequence_with

    return best_sequence                                                        

The O(n^2) Dynamic Programming Solution

def dynamic_programming_solution(sequence):
    """Finds the longest increasing subsequence in sequence using dynamic          
    programming.  This solution is O(n^2)."""

    longest_subsequence_ending_with = []
    backreference_for_subsequence_ending_with = []
    current_best_end = 0

    for curr_elem in range(len(sequence)):
        # It's always possible to have a subsequence of length 1.                    
        longest_subsequence_ending_with.append(1)

        # If a subsequence is length 1, it doesn't have a backreference.             
        backreference_for_subsequence_ending_with.append(None)

        for prev_elem in range(curr_elem):
            subsequence_length_through_prev = (longest_subsequence_ending_with[prev_elem] + 1)

            # If the prev_elem is smaller than the current elem (so it's increasing)   
            # And if the longest subsequence from prev_elem would yield a better       
            # subsequence for curr_elem.                                               
            if ((sequence[prev_elem] < sequence[curr_elem]) and
                    (subsequence_length_through_prev >
                         longest_subsequence_ending_with[curr_elem])):

                # Set the candidate best subsequence at curr_elem to go through prev.    
                longest_subsequence_ending_with[curr_elem] = (subsequence_length_through_prev)
                backreference_for_subsequence_ending_with[curr_elem] = prev_elem
                # If the new end is the best, update the best.    

        if (longest_subsequence_ending_with[curr_elem] >
                longest_subsequence_ending_with[current_best_end]):
            current_best_end = curr_elem
            # Output the overall best by following the backreferences.  

    best_subsequence = []
    current_backreference = current_best_end

    while current_backreference is not None:
        best_subsequence.append(sequence[current_backreference])
        current_backreference = (backreference_for_subsequence_ending_with[current_backreference])

    best_subsequence.reverse()

    return best_subsequence                                                   

The O(n log n) Dynamic Programming Solution

def find_smallest_elem_as_big_as(sequence, subsequence, elem):
    """Returns the index of the smallest element in subsequence as big as          
    sequence[elem].  sequence[elem] must not be larger than every element in       
    subsequence.  The elements in subsequence are indices in sequence.  Uses       
    binary search."""

    low = 0
    high = len(subsequence) - 1

    while high > low:
        mid = (high + low) / 2
        # If the current element is not as big as elem, throw out the low half of    
        # sequence.                                                                  
        if sequence[subsequence[mid]] < sequence[elem]:
            low = mid + 1
            # If the current element is as big as elem, throw out everything bigger, but 
        # keep the current element.                                                  
        else:
            high = mid

    return high


def optimized_dynamic_programming_solution(sequence):
    """Finds the longest increasing subsequence in sequence using dynamic          
    programming and binary search (per                                             
    http://en.wikipedia.org/wiki/Longest_increasing_subsequence).  This solution   
    is O(n log n)."""

    # Both of these lists hold the indices of elements in sequence and not the        
    # elements themselves.                                                         
    # This list will always be sorted.                                             
    smallest_end_to_subsequence_of_length = []

    # This array goes along with sequence (not                                     
    # smallest_end_to_subsequence_of_length).  Following the corresponding element 
    # in this array repeatedly will generate the desired subsequence.              
    parent = [None for _ in sequence]

    for elem in range(len(sequence)):
        # We're iterating through sequence in order, so if elem is bigger than the   
        # end of longest current subsequence, we have a new longest increasing          
        # subsequence.                                                               
        if (len(smallest_end_to_subsequence_of_length) == 0 or
                    sequence[elem] > sequence[smallest_end_to_subsequence_of_length[-1]]):
            # If we are adding the first element, it has no parent.  Otherwise, we        
            # need to update the parent to be the previous biggest element.            
            if len(smallest_end_to_subsequence_of_length) > 0:
                parent[elem] = smallest_end_to_subsequence_of_length[-1]
            smallest_end_to_subsequence_of_length.append(elem)
        else:
            # If we can't make a longer subsequence, we might be able to make a        
            # subsequence of equal size to one of our earlier subsequences with a         
            # smaller ending number (which makes it easier to find a later number that 
            # is increasing).                                                          
            # Thus, we look for the smallest element in                                
            # smallest_end_to_subsequence_of_length that is at least as big as elem       
            # and replace it with elem.                                                
            # This preserves correctness because if there is a subsequence of length n 
            # that ends with a number smaller than elem, we could add elem on to the   
            # end of that subsequence to get a subsequence of length n+1.              
            location_to_replace = find_smallest_elem_as_big_as(sequence, smallest_end_to_subsequence_of_length, elem)
            smallest_end_to_subsequence_of_length[location_to_replace] = elem
            # If we're replacing the first element, we don't need to update its parent 
            # because a subsequence of length 1 has no parent.  Otherwise, its parent  
            # is the subsequence one shorter, which we just added onto.                
            if location_to_replace != 0:
                parent[elem] = (smallest_end_to_subsequence_of_length[location_to_replace - 1])

    # Generate the longest increasing subsequence by backtracking through parent.  
    curr_parent = smallest_end_to_subsequence_of_length[-1]
    longest_increasing_subsequence = []

    while curr_parent is not None:
        longest_increasing_subsequence.append(sequence[curr_parent])
        curr_parent = parent[curr_parent]

    longest_increasing_subsequence.reverse()

    return longest_increasing_subsequence         

Speaking about DP solution, I found it surprising that no one mentioned the fact that LIS can be reduced to LCS. All you need to do is sort the copy of the original sequence, remove all the duplicates and do LCS of them. In pseudocode it is:

def LIS(S):
    T = sort(S)
    T = removeDuplicates(T)
    return LCS(S, T)

And the full implementation written in Go. You do not need to maintain the whole n^2 DP matrix if you do not need to reconstruct the solution.

func lcs(arr1 []int) int {
    arr2 := make([]int, len(arr1))
    for i, v := range arr1 {
        arr2[i] = v
    }
    sort.Ints(arr1)
    arr3 := []int{}
    prev := arr1[0] - 1
    for _, v := range arr1 {
        if v != prev {
            prev = v
            arr3 = append(arr3, v)
        }
    }

    n1, n2 := len(arr1), len(arr3)

    M := make([][]int, n2 + 1)
    e := make([]int, (n1 + 1) * (n2 + 1))
    for i := range M {
        M[i] = e[i * (n1 + 1):(i + 1) * (n1 + 1)]
    }

    for i := 1; i <= n2; i++ {
        for j := 1; j <= n1; j++ {
            if arr2[j - 1] == arr3[i - 1] {
                M[i][j] = M[i - 1][j - 1] + 1
            } else if M[i - 1][j] > M[i][j - 1] {
                M[i][j] = M[i - 1][j]
            } else {
                M[i][j] = M[i][j - 1]
            }
        }
    }

    return M[n2][n1]
}

The following C++ implementation includes also some code that builds the actual longest increasing subsequence using an array called prev.

std::vector<int> longest_increasing_subsequence (const std::vector<int>& s)
{
    int best_end = 0;
    int sz = s.size();

    if (!sz)
        return std::vector<int>();

    std::vector<int> prev(sz,-1);
    std::vector<int> memo(sz, 0);

    int max_length = std::numeric_limits<int>::min();

    memo[0] = 1;

    for ( auto i = 1; i < sz; ++i)
    {
        for ( auto j = 0; j < i; ++j)
        {
            if ( s[j] < s[i] && memo[i] < memo[j] + 1 )
            {
                memo[i] =  memo[j] + 1;
                prev[i] =  j;
            }
        }

        if ( memo[i] > max_length ) 
        {
            best_end = i;
            max_length = memo[i];
        }
    }

    // Code that builds the longest increasing subsequence using "prev"
    std::vector<int> results;
    results.reserve(sz);

    std::stack<int> stk;
    int current = best_end;

    while (current != -1)
    {
        stk.push(s[current]);
        current = prev[current];
    }

    while (!stk.empty())
    {
        results.push_back(stk.top());
        stk.pop();
    }

    return results;
}

Implementation with no stack just reverse the vector

#include <iostream>
#include <vector>
#include <limits>
std::vector<int> LIS( const std::vector<int> &v ) {
  auto sz = v.size();
  if(!sz)
    return v;
  std::vector<int> memo(sz, 0);
  std::vector<int> prev(sz, -1);
  memo[0] = 1;
  int best_end = 0;
  int max_length = std::numeric_limits<int>::min();
  for (auto i = 1; i < sz; ++i) {
    for ( auto j = 0; j < i ; ++j) {
      if (s[j] < s[i] && memo[i] < memo[j] + 1) {
        memo[i] = memo[j] + 1;
        prev[i] = j;
      }
    }
    if(memo[i] > max_length) {
      best_end = i;
      max_length = memo[i];
    }
  }

  // create results
  std::vector<int> results;
  results.reserve(v.size());
  auto current = best_end;
  while (current != -1) {
    results.push_back(s[current]);
    current = prev[current];
  }
  std::reverse(results.begin(), results.end());
  return results;
}

Here are three steps of evaluating the problem from dynamic programming point of view:

  1. Recurrence definition: maxLength(i) == 1 + maxLength(j) where 0 < j < i and array[i] > array[j]
  2. Recurrence parameter boundary: there might be 0 to i - 1 sub-sequences passed as a paramter
  3. Evaluation order: as it is increasing sub-sequence, it has to be evaluated from 0 to n

If we take as an example sequence {0, 8, 2, 3, 7, 9}, at index:

  • [0] we'll get subsequence {0} as a base case
  • [1] we have 1 new subsequence {0, 8}
  • [2] trying to evaluate two new sequences {0, 8, 2} and {0, 2} by adding element at index 2 to existing sub-sequences - only one is valid, so adding third possible sequence {0, 2} only to parameters list ...

Here's the working C++11 code:

#include <iostream>
#include <vector>

int getLongestIncSub(const std::vector<int> &sequence, size_t index, std::vector<std::vector<int>> &sub) {
    if(index == 0) {
        sub.push_back(std::vector<int>{sequence[0]});
        return 1;
    }

    size_t longestSubSeq = getLongestIncSub(sequence, index - 1, sub);
    std::vector<std::vector<int>> tmpSubSeq;
    for(std::vector<int> &subSeq : sub) {
        if(subSeq[subSeq.size() - 1] < sequence[index]) {
            std::vector<int> newSeq(subSeq);
            newSeq.push_back(sequence[index]);
            longestSubSeq = std::max(longestSubSeq, newSeq.size());
            tmpSubSeq.push_back(newSeq);
        }
    }
    std::copy(tmpSubSeq.begin(), tmpSubSeq.end(),
              std::back_insert_iterator<std::vector<std::vector<int>>>(sub));

    return longestSubSeq;
}

int getLongestIncSub(const std::vector<int> &sequence) {
    std::vector<std::vector<int>> sub;
    return getLongestIncSub(sequence, sequence.size() - 1, sub);
}

int main()
{
    std::vector<int> seq{0, 8, 2, 3, 7, 9};
    std::cout << getLongestIncSub(seq);
    return 0;
}