Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Memoization algorithm time complexity

I read this article Retiring a Great Interview Problem, the author came up with a work break problem and gave three solutions. The efficient one uses memoization algorithm and the author said its worst case time complexity is O(n^2) since the key insight is that SegmentString is only called on suffixes of the original input string, and that there are only O(n) suffixes.

However, I find it difficult for me to understand why it is O(n^2). Could someone please give me a hint or proof?

Work Break Problem: 
    Given an input string and a dictionary of words,
    segment the input string into a space-separated
    sequence of dictionary words if possible. For
    example, if the input string is "applepie" and
    dictionary contains a standard set of English words,
    then we would return the string "apple pie" as output.

The memoization algorithm from Retiring a Great Interview Problem

Map<String, String> memoized;

String SegmentString(String input, Set<String> dict) {
      if (dict.contains(input)) 
          return input;
      if (memoized.containsKey(input) {
          return memoized.get(input);
      }
      int len = input.length();
      for (int i = 1; i < len; i++) {
          String prefix = input.substring(0, i);
          if (dict.contains(prefix)) {
              String suffix = input.substring(i, len);
              String segSuffix = SegmentString(suffix, dict);
              if (segSuffix != null) {
                  return prefix + " " + segSuffix;
              }
          }
      }
      memoized.put(input, null);
      return null;
}
like image 246
mitchelllc Avatar asked Jan 22 '14 03:01

mitchelllc


People also ask

What is the time complexity of memoization?

Because no node is called more than once, this dynamic programming strategy known as memoization has a time complexity of O(N), not O(2^N).

Does memoization lower algorithmic complexity?

With recursion, the trick of using Memoization the cache results will often dramatically improve the time complexity of the problem.

Does memoization increase space complexity?

Explanation: As the mentioned approach uses the memoization technique it always stores the previously calculated values. Due to this, the time complexity is decreased but the space complexity is increased.

What is the time complexity of Fibonacci with memoization?

Memoized time complexity calculation Let's take time complexity of n to be T(n) , hence T(n) = T(n-1) + T(n-2) . This is because F(n-2) is in the cache when we calculate F(n - 1) . Therefore, the function of F(n-2) is 1 (reading from cached equation), hence T(n) = T(n-1) + 1 = T(n-2) + 2 = ... = T(n-n) + n.


1 Answers

Intuition

(it is clear that the worst case is the one where there are no solutions, so I focus on that)

Since the recursive call is made before putting values in the memoization-cache, the last (shortest) suffixes will get cached first. This is because the function is first called with a string of length N, which then calls itself with a string of length N-1, which then .... , with a string of len 0, which is cached and returns, then the string of length 1 is cached and returns, ..., length N is cached and returns.

As the hint suggests, only suffixes get cached, and there are only N of them. This means that by the time the top-level function gets the result of its first recursive call (i.e. on a suffix of length N-1), the cache is already populated with the N-1 suffixes.

Now, assuming the N-1 last suffixes are already cached, the for-loop needs to make N-1 recursive calls, each taking O(1) (since the answer is already cached), totalling O(N). However, (pre-) building the cache of the last N-1 takes O(N^2) (explained below), totalling with O(N)+O(N^2) = O(N^2).


Proof by mathematical induction

This explanation can easily be translated to a formal proof using induction. Here is the gist of it:

(f(N) being the number of operations required for the function to complete on an input of length N)

Induction hypothesis -- there exists a constant c s.t. f(N) < c*N^2.

The base case is trivial -- for a string of length 1, you can find a constant c such that f(1) < c*N^2 = c

Induction step

Recap of the order things happen:

Step 1: the function first calls itself on the suffix of length N-1, building a cache containing the answer for the last N-1 suffixes

Step 2: the function then calls itself O(N) more times, each taking O(1) (thanks to this test: if (memoized.containsKey(input)), and the fact that the cache has already been populated in Step 1).

So we get:

f(N+1) = f(N) + a*N <   (by the hypothesis)
       < c*N^2 + a*N <  (for c large enough)
       < c*N^2 + 2*c*N + c =
       = c*(N+1)^2

Thus we got f(N+1) < c*(N+1)^2, which completes the proof.

like image 189
shx2 Avatar answered Nov 09 '22 08:11

shx2