Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Maximizing a particular sum over all possible subarrays

Consider an array like this one below:

  {1, 5, 3, 5, 4, 1}

When we choose a subarray, we reduce it to the lowest number in the subarray. For example, the subarray {5, 3, 5} becomes {3, 3, 3}. Now, the sum of the subarray is defined as the sum of the resultant subarray. For example, {5, 3, 5} the sum is 3 + 3 + 3 = 9. The task is to find the largest possible sum that can be made from any subarray. For the above array, the largest sum is 12, given by the subarray {5, 3, 5, 4}.

Is it possible to solve this problem in time better than O(n2)?

like image 254
AvinashK Avatar asked Mar 09 '13 06:03

AvinashK


People also ask

How do you find the maximum sum of all Subarrays?

Simple Approach: Run a loop for i from 0 to n – 1, where n is the size of the array. Now, we will run a nested loop for j from i to n – 1 and add the value of the element at index j to a variable currentMax. Lastly, for every subarray, we will check if the currentMax is the maximum sum of all contiguous subarrays.

How do you find the maximum of all subarrays of an array?

Assuming you mean contiguous sub-arrays, create the array of partial sums where Yi = SUM(i=0..i)Xi, so from 1,4,2,3 create 0,1,1+4=5,1+4+2=7,1+4+2+3=10. You can create this from left to right in linear time, and the value of any contiguous subarray is one partial sum subtracted from another, so 4+2+3 = 1+4+2+3 - 1= 9.

How do you find the number of Subarrays whose Max K is?

int count1 = totalSubarrays(arr, n, k - 1); // Stores count of subarrays with max >= k + 1. int count2 = totalSubarrays(arr, n, k); // Stores count of subarrays with max = k.

What is the sum of its maximum subarray?

For arrays with no negative integers, the maximum subarray sum is the sum of the array in itself.


2 Answers

I believe that I have an algorithm for this that runs in O(n) time. I'll first describe an unoptimized version of the algorithm, then give a fully optimized version.

For simplicity, let's initially assume that all values in the original array are distinct. This isn't true in general, but it gives a good starting point.

The key observation behind the algorithm is the following. Find the smallest element in the array, then split the array into three parts - all elements to the left of the minimum, the minimum element itself, and all elements to the right of the minimum. Schematically, this would look something like

 +-----------------------+-----+-----------------------+
 |     left values       | min |      right values     |
 +-----------------------+-----+-----------------------+     

Here's the key observation: if you take the subarray that gives the optimum value, one of three things must be true:

  1. That array consists of all the values in the array, including the minimum value. This has total value min * n, where n is the number of elements.
  2. That array does not include the minimum element. In that case, the subarray has to be purely to the left or to the right of the minimum value and cannot include the minimum value itself.

This gives a nice initial recursive algorithm for solving this problem:

  • If the sequence is empty, the answer is 0.
  • If the sequence is nonempty:
    • Find the minimum value in the sequence.
    • Return the maximum of the following:
      • The best answer for the subarray to the left of the minimum.
      • The best answer for the subarray to the right of the minimum.
      • The number of elements times the minimum.

So how efficient is this algorithm? Well, that really depends on where the minimum elements are. If you think about it, we do linear work to find the minimum, then divide the problem into two subproblems and recurse on each. This is the exact same recurrence you get when considering quicksort. This means that in the best case it will take Θ(n log n) time (if we always have the minimum element in the middle of each half), but in the worst case it will take Θ(n2) time (if we always have the minimum value purely on the far left or the far right.

Notice, however, that all of the effort we're spending is being used to find the minimum value in each of the subarrays, which takes O(k) time for k elements. What if we could speed this up to O(1) time? In that case, our algorithm would do a lot less work. More specifically, it would do only O(n) work. The reason for this is the following: each time we make a recursive call, we do O(1) work to find the minimum element, then remove that element from the array and recursively process the remaining pieces. Each element can therefore be the minimum element of at most one of the recursive calls, and so the total number of recursive calls can't be any greater than the number of elements. This means that we make at most O(n) calls that each do O(1) work, which gives a total of O(1) work.

So how exactly do we get this magical speedup? This is where we get to use a surprisingly versatile and underappreciated data structure called the Cartesian tree. A Cartesian tree is a binary tree created out of a sequence of elements that has the following properties:

  • Each node is smaller than its children, and
  • An inorder walk of the Cartesian tree gives back the elements of the sequence in the order in which they appear.

For example, the sequence 4 6 7 1 5 0 2 8 3 has this Cartesian tree:

       0
      / \
     1   2
    / \   \
   4   5   3
    \     /
     6   8
      \
       7

And here's where we get the magic. We can immediately find the minimum element of the sequence by just looking at the root of the Cartesian tree - that takes only O(1) time. Once we've done that, when we make our recursive calls and look at all the elements to the left of or to the right of the minimum element, we're just recursively descending into the left and right subtrees of the root node, which means that we can read off the minimum elements of those subarrays in O(1) time each. Nifty!

The real beauty is that it is possible to construct a Cartesian tree for a sequence of n elements in O(n) time. This algorithm is detailed in this section of the Wikipedia article. This means that we can get a super fast algorithm for solving your original problem as follows:

  • Construct a Cartesian tree for the array.
  • Use the above recursive algorithm, but use the Cartesian tree to find the minimum element rather than doing a linear scan each time.

Overall, this takes O(n) time and uses O(n) space, which is a time improvement over the O(n2) algorithm you had initially.

At the start of this discussion, I made the assumption that all array elements are distinct, but this isn't really necessary. You can still build a Cartesian tree for an array with non-distinct elements in it by changing the requirement that each node is smaller than its children to be that each node is no bigger than its children. This doesn't affect the correctness of the algorithm or its runtime; I'll leave that as the proverbial "exercise to the reader." :-)

This was a cool problem! I hope this helps!

like image 80
templatetypedef Avatar answered Sep 28 '22 17:09

templatetypedef


Assuming that the numbers are all non-negative, isn't this just the "maximize the rectangle area in a histogram" problem? which has now become famous...

O(n) solutions are possible. This site: http://blog.csdn.net/arbuckle/article/details/710988 has a bunch of neat solutions.

To elaborate what I am thinking (it might be incorrect) think of each number as histogram rectangle of width 1.

By "minimizing" a subarray [i,j] and adding up, you are basically getting the area of the rectangle in the histogram which spans from i to j.

This has appeared before on SO: Maximize the rectangular area under Histogram, you find code and explanation, and a link to the official solutions page (http://www.informatik.uni-ulm.de/acm/Locals/2003/html/judge.html).

like image 28
Knoothe Avatar answered Sep 28 '22 16:09

Knoothe