Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

sum of maximum element of sliding window of length K

Tags:

algorithm

Recently I got stuck in a problem. The part of algorithm requires to compute sum of maximum element of sliding windows of length K. Where K ranges from 1<=K<=N (N length of an array).

Example if I have an array A as 5,3,12,4
Sliding window of length 1: 5 + 3 + 12 + 4 = 24
Sliding window of length 2: 5 + 12 + 12 = 29
Sliding window of length 3: 12 + 12 = 24
Sliding window of length 4: 12

Final answer is 24,29,24,12.

I have tried to this O(N^2). For each sliding window of length K, I can calculate the maximum in O(N). Since K is upto N. Therefore, overall complexity turns out to be O(N^2).
I am looking for O(N) or O(NlogN) or something similar to this algorithm as N maybe upto 10^5.
Note: Elements in array can be as large as 10^9 so output the final answer as modulo 10^9+7

EDIT: What I actually want to find answer for each and every value of K (i.e. from 0 to N) in overall linear time or in O(NlogN) not in O(KN) or O(KNlogN) where K={1,2,3,.... N}

like image 461
Mike Avatar asked Feb 08 '16 08:02

Mike


People also ask

How do you find the maximum sum of k elements in an array?

Given an array of integers and a number k, find the maximum sum of a subarray of size k. Examples: Input : arr[] = {100, 200, 300, 400} k = 2 Output : 700 Input : arr[] = {1, 4, 2, 10, 23, 3, 1, 0, 20} k = 4 Output : 39 We get maximum sum by adding subarray {4, 2, 10, 23} of size 4.

How many windows would exist in an array with N elements and window of size k?

There can be a total of N – K + 1 sliding window and there are K elements in each window.


2 Answers

Here's an abbreviated sketch of O(n).

For each element, determine how many contiguous elements to the left are no greater (call this a), and how many contiguous elements to the right are lesser (call this b). This can be done for all elements in time O(n) -- see MBo's answer.

A particular element is maximum in its window if the window contains the element and only elements among to a to its left and the b to its right. Usefully, the number of such windows of length k (and hence the total contribution of these windows) is piecewise linear in k, with at most five pieces. For example, if a = 5 and b = 3, there are

1 window  of size 1
2 windows of size 2
3 windows of size 3
4 windows of size 4
4 windows of size 5
4 windows of size 6
3 windows of size 7
2 windows of size 8
1 window  of size 9.

The data structure that we need to encode this contribution efficiently is a Fenwick tree whose values are not numbers but linear functions of k. For each linear piece of the piecewise linear contribution function, we add it to the cell at beginning of its interval and subtract it from the cell at the end (closed beginning, open end). At the end, we retrieve all of the prefix sums and evaluate them at their index k to get the final array.

(OK, have to run for now, but we don't actually need a Fenwick tree for step two, which drops the complexity to O(n) for that, and there may be a way to do step one in linear time as well.)

Python 3, lightly tested:

def left_extents(lst):
  result = []
  stack = [-1]
  for i in range(len(lst)):
    while stack[-1] >= 0 and lst[i] >= lst[stack[-1]]:
      del stack[-1]
    result.append(stack[-1] + 1)
    stack.append(i)
  return result


def right_extents(lst):
  result = []
  stack = [len(lst)]
  for i in range(len(lst) - 1, -1, -1):
    while stack[-1] < len(lst) and lst[i] > lst[stack[-1]]:
      del stack[-1]
    result.append(stack[-1])
    stack.append(i)
  result.reverse()
  return result


def sliding_window_totals(lst):
  delta_constant = [0] * (len(lst) + 2)
  delta_linear = [0] * (len(lst) + 2)
  for l, i, r in zip(left_extents(lst), range(len(lst)), right_extents(lst)):
    a = i - l
    b = r - (i + 1)
    if a > b:
      a, b = b, a
    delta_linear[1] += lst[i]
    delta_linear[a + 1] -= lst[i]
    delta_constant[a + 1] += lst[i] * (a + 1)
    delta_constant[b + 2] += lst[i] * (b + 1)
    delta_linear[b + 2] -= lst[i]
    delta_linear[a + b + 2] += lst[i]
    delta_constant[a + b + 2] -= lst[i] * (a + 1)
    delta_constant[a + b + 2] -= lst[i] * (b + 1)
  result = []
  constant = 0
  linear = 0
  for j in range(1, len(lst) + 1):
    constant += delta_constant[j]
    linear += delta_linear[j]
    result.append(constant + linear * j)
  return result

print(sliding_window_totals([5, 3, 12, 4]))
like image 56
David Eisenstat Avatar answered Sep 28 '22 19:09

David Eisenstat


Let's determine for every element an interval, where this element is dominating (maximum). We can do this in linear time with forward and backward runs using stack. Arrays L and R will contain indexes out of the domination interval.

To get right and left indexes:

Stack.Push(0) //(1st element index)
for i = 1 to Len - 1 do
     while Stack.Peek < X[i] do
         j = Stack.Pop
         R[j] = i   //j-th position is dominated by i-th one from the right
     Stack.Push(i)
while not Stack.Empty
   R[Stack.Pop] = Len  //the rest of elements are not dominated from the right

//now right to left
Stack.Push(Len - 1) //(last element index)
for i = Len - 2 to 0 do
     while Stack.Peek < X[i] do
         j = Stack.Pop
         L[j] = i   //j-th position is dominated by i-th one from the left
     Stack.Push(i)
while not Stack.Empty
   L[Stack.Pop] = -1  //the rest of elements are not dominated from the left

Result for (5,7,3,9,4) array.
For example, 7 dominates at 0..2 interval, 9 at 0..4

i  0   1   2   3   4 
X  5   7   3   9   4
R  1   3   3   5   5
L -1  -1   1  -1   4

Now for every element we can count it's impact in every possible sum.
Element 5 dominates at (0,0) interval, it is summed only in k=1 sum entry
Element 7 dominates at (0,2) interval, it is summed once in k=1 sum entry, twice in k=2 entry, once in k=3 entry.
Element 3 dominates at (2,2) interval, it is summed only in k=1 sum entry
Element 9 dominates at (0,4) interval, it is summed once in k=1 sum entry, twice in k=2, twice in k=3, twice in k=4, once in k=5.
Element 4 dominates at (4,4) interval, it is summed only in k=1 sum entry.

In general element with long domination interval in the center of long array may give up to k*Value impact in k-length sum (it depends on position relative to array ends and to another dom. elements)

k    1    2    3     4    5
 --------------------------
     5        
     7    2*7  7
     3    
     9    2*9  2*9   2*9  9
     4
 --------------------------
S(k) 28   32   25    18   9

Note that the sum of coefficients is N*(N-1)/2 (equal to the number of possible windows), the most of table entries are empty, so complexity seems better than O(N^2)
(I still doubt about exact complexity)

like image 31
MBo Avatar answered Sep 28 '22 19:09

MBo