python - prefix sum algorithm

I am trying to grasp the idea behind the prefix sum concept by looking at the example presented in Codility's Prefix Sums lesson (the mushroom picker problem).

My understanding is that the whole concept rests on a simple property: to find the sum of all elements of an array A between two positions pos_left and pos_right, a second array P is used in which the elements of A are summed cumulatively, and the sum sought is calculated as
P[pos_right + 1] - P[pos_left].

A 1 2 3 4  5  6
P 0 1 3 6 10 15 21
sum of all elements between A[2] and A[4] = 3 + 4 + 5 = 12
or, using the prefix sums: P[4+1] - P[2] = 15 - 3 = 12
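
As a quick check of the identity sum(A[x..y]) = P[y + 1] - P[x], here is a minimal sketch. It assumes Python 3 and uses itertools.accumulate rather than the explicit loop from the lesson; range_sum is an illustrative name, not something from the Codility material.

```python
from itertools import accumulate

A = [1, 2, 3, 4, 5, 6]

# P[i] holds the sum of the first i elements of A, so P[0] == 0
# and P has len(A) + 1 entries.
P = [0] + list(accumulate(A))


def range_sum(P, x, y):
    """Sum of A[x..y], both ends inclusive, 0-based indices."""
    return P[y + 1] - P[x]


print(P)                    # [0, 1, 3, 6, 10, 15, 21]
print(range_sum(P, 2, 4))   # 3 + 4 + 5 = 12
print(range_sum(P, 2, 5))   # 3 + 4 + 5 + 6 = 18
```

The leading zero is what lets the same formula work for x == 0 without a special case.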

The problem
A street with mushrooms at every spot is represented by a non-empty vector. Given the picker's initial position and the number of moves allowed, we want the maximum number of mushrooms that can be collected.

Looking at the example, I don't understand the logic behind the construction of the loops. Can anybody clarify the mechanics of this algorithm?

Secondly, I found the position indexing in this example very confusing and cumbersome. Is it common practice to "shift" the vector of prefix sums by putting a zero at the beginning? (The fact that indexing in Python starts from 0 by default already causes some confusion.)

The solution

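def prefix_sums(A):
    # P[k] holds the sum of the first k elements of A, so P[0] == 0.
    n = len(A)
    P = [0] * (n + 1)
    for k in xrange(1, n + 1):
        P[k] = P[k - 1] + A[k - 1]
    return P


def count_total(P, x, y):
    # Sum of A[x..y], both ends inclusive.
    return P[y + 1] - P[x]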

# A mushroom picker is at spot number k on the road and should perform m moves
def mushrooms(A, k, m):
    n = len(A)
    result = 0
    pref = prefix_sums(A)
    for p in xrange(min(m, k) + 1):   # going left
        left_pos = k - p
        right_pos = min(n - 1, max(k, k + m - 2 * p))
        result = max(result, count_total(pref, left_pos, right_pos))
    for p in xrange(min(m + 1, n - k)):   # going right
        right_pos = k + p
        left_pos = max(0, min(k, k - (m - 2 * p)))
        result = max(result, count_total(pref, left_pos, right_pos))
    return result

I ran an example with a small array A = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20], position k = 5 and range m = 3. I don't understand the logic by which the two loops construct the ranges to check.

I get the following parameters for the loops

(p, left_pos, right_pos)
loop 1: (0, 5, 8), (1, 4, 6), (2, 3, 5), (3, 2, 5)
loop 2: (0, 2, 5), (1, 4, 6), (2, 5, 7), (3, 5, 8)

The ranges vary. Why?

Version for debugging

def mushrooms2(A, k, m):
    n = len(A)
    result = 0
    pref = prefix_sums(A)
    l1 =min(m, k) + 1
    print 'loop p in xrange(min(m, k) + 1): %d' % l1
    for p in xrange(min(m, k) + 1):
        print 'p %d' % p
        print 'A= %r' % A
        print 'pref= %r' % pref
        left_pos = k - p
        right_pos = min(n - 1, max(k, k + m - 2 * p))
        result = max(result, count_total(pref, left_pos, right_pos))
        print 'left_pos = k - p= %d' % left_pos
        print 'right_pos= min(n-1,max(k,k+m-2*p))= %d' % right_pos
        print 'max'
        print '(result %d' % result
        print 'count_total(pref, left_pos, right_pos)) %r, %r, %r, %r' % (pref,left_pos, right_pos,count_total(pref, left_pos, right_pos))
        print 'result= %d' % result
        print 'next p'
    l2=min(m + 1, n - k)
    print   'loop xrange(min(m + 1, n - k)): %d' % l2
    for p in xrange(min(m + 1, n - k)):
        print 'p %d' % p
        print 'A= %r' % A
        print 'pref= %r' % pref
        right_pos = k + p
        left_pos = max(0, min(k, k - (m - 2 * p)))
        result = max(result, count_total(pref, left_pos, right_pos))
        print 'right_pos = k + p= %d' % right_pos
        print 'left_pos = max(0, min(k, k - (m - 2 * p)))= %d' % left_pos
        print 'max'
        print '(result %d' % result
        print 'count_total(pref, left_pos, right_pos)) %r, %r, %r, %r' % (pref,left_pos, right_pos,count_total(pref, left_pos, right_pos))
        print 'result= %d' % result
        print 'next p'
    print 'result %d' % result
    return result

asked Oct 31 '16 03:10

Chris

1 Answer

You are not alone in considering the loop construction to be counter-intuitive, as I had to spend a few minutes on it as well. Here's what I figured out.

The solution in the link you provided explains that the optimal strategy is to walk along the road changing direction only once. Walking that way covers a contiguous range of spots with a left and a right endpoint, which is what left_pos and right_pos represent.

As for the particulars of the loops: instead of thinking in terms of the loop variable (i.e. p), it is easier to look at what changes over the course of the loop and how p is used. Otherwise, the min and max expressions look rather peculiar at first.

For instance, in the first loop, instead of working out what the range represents, look at how left_pos is affected by the values p takes. After a bit of thought, one notices that left_pos runs through exactly the possible left endpoints.

Specifically, when p == 0, the left endpoint is the starting index (i.e. k), and when p is min(m, k), it is either 0 (i.e. if k < m) or (k - m). In the former case, that is as far as the left endpoint can go, since anything smaller would fall outside the valid range of spots on the road. In the latter case, the number of moves rules out any solution with a left_pos smaller than (k - m), since it is impossible to get from k to those indices in m moves.
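
To make the set of left endpoints concrete, here is a small sketch, assuming Python 3. The helper left_endpoints is a made-up name for illustration; it only reproduces the expression left_pos = k - p over the first loop's range.

```python
def left_endpoints(k, m):
    """Left endpoints tried by the first loop: k, k - 1, ..., max(0, k - m)."""
    return [k - p for p in range(min(m, k) + 1)]


# Plenty of room on the left: the number of moves m is the limit.
print(left_endpoints(5, 3))   # [5, 4, 3, 2]

# Close to the start of the road: index 0 is the limit.
print(left_endpoints(2, 3))   # [2, 1, 0]
```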

The assignment to right_pos in the first loop can be explained similarly. The min expression includes (n - 1), the rightmost legal index, which keeps the right endpoint within the allowed limits. The inner max expression includes k, the smallest possible value for right_pos (since k is the starting point), together with the expression (k + m - 2 * p). That expression represents the following process:

Go to left for p moves.
Change direction, and go to right for p moves to reach the starting point.
Go to right with the remaining (m - 2p) moves.
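
The three steps above can be checked by simulating the walk one move at a time and comparing the result with the closed-form expressions from the first loop. This is only a sketch, assuming Python 3; simulate and closed_form are illustrative names, and a move that would step past the end of the road is assumed to be wasted.

```python
def simulate(n, k, m, p):
    """Walk p steps left from k, then spend the remaining moves going right.

    Assumes p <= min(m, k). Returns (leftmost, rightmost) spot visited.
    """
    pos = lo = hi = k
    for _ in range(p):          # p moves to the left
        pos -= 1
        lo = min(lo, pos)
    for _ in range(m - p):      # every remaining move goes right
        if pos < n - 1:         # cannot step past the end of the road
            pos += 1
        hi = max(hi, pos)
    return lo, hi


def closed_form(n, k, m, p):
    """The expressions used in the first loop of mushrooms()."""
    return k - p, min(n - 1, max(k, k + m - 2 * p))


n, k, m = 20, 5, 3
for p in range(min(m, k) + 1):
    assert simulate(n, k, m, p) == closed_form(n, k, m, p)
    print(p, closed_form(n, k, m, p))
```

For n = 20, k = 5, m = 3 this prints the same (left_pos, right_pos) pairs the question lists for loop 1.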

The second loop is just the reflection of this first loop, and you may explain it simply by adapting my explanation of the first loop.
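
To see the two loops side by side, here is a Python 3 port of the function from the question (range instead of xrange, itertools.accumulate for the prefix sums), plus a brute-force checker. The checker brute_force is my own hypothetical helper, not part of the Codility solution: it tries every sequence of m left/right steps, so it is exponential and only suitable for small inputs. It also assumes mushroom counts are non-negative.

```python
from itertools import accumulate, product


def mushrooms(A, k, m):
    n = len(A)
    pref = [0] + list(accumulate(A))

    def total(x, y):
        return pref[y + 1] - pref[x]

    best = 0
    for p in range(min(m, k) + 1):        # p steps left first, then right
        left = k - p
        right = min(n - 1, max(k, k + m - 2 * p))
        best = max(best, total(left, right))
    for p in range(min(m + 1, n - k)):    # mirror: p steps right first, then left
        right = k + p
        left = max(0, min(k, k - (m - 2 * p)))
        best = max(best, total(left, right))
    return best


def brute_force(A, k, m):
    """Try every sequence of m single steps; for checking small cases only."""
    n, best = len(A), 0
    for steps in product((-1, 1), repeat=m):
        pos, seen = k, {k}
        for s in steps:
            pos = min(n - 1, max(0, pos + s))   # a step off the road is wasted
            seen.add(pos)
        best = max(best, sum(A[i] for i in seen))
    return best


A = [2, 3, 7, 5, 1, 3, 9]
print(mushrooms(A, 4, 6), brute_force(A, 4, 6))   # 25 25
```

In the first loop left_pos is driven directly by p and right_pos is derived from the leftover moves; in the second loop the roles are swapped.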

As to your second question, I do not think it is common practice to shift the indices of prefix sum arrays. I typically use this technique in online programming contests, and my Python implementation of the prefix sum array you use would be as follows.

def prefix_sums(A):
    n = len(A)
    P = [0] * n
    P[0] = A[0]
    for k in xrange(1, n):
        P[k] = P[k - 1] + A[k]
    return P

def count_total(P, x, y):
    return (P[y] - P[x - 1] if x > 0 else P[y])

The fundamental idea behind the implementation above is that, at P[x], we have the sum A[0] + A[1] + ... + A[x].
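
The two conventions give the same range sums. The sketch below, assuming Python 3, compares them on a small array; the function names count_total_shifted and count_total_inclusive are introduced here only to tell the two versions apart.

```python
from itertools import accumulate


def count_total_shifted(P, x, y):
    """P has a leading 0 and len(A) + 1 entries (the Codility version)."""
    return P[y + 1] - P[x]


def count_total_inclusive(P, x, y):
    """P[i] == A[0] + ... + A[i] (the version in this answer)."""
    return P[y] - P[x - 1] if x > 0 else P[y]


A = [1, 2, 3, 4, 5, 6]
inclusive = list(accumulate(A))   # [1, 3, 6, 10, 15, 21]
shifted = [0] + inclusive         # [0, 1, 3, 6, 10, 15, 21]

# Every inclusive range [x, y] gives the same answer under both conventions.
for x in range(len(A)):
    for y in range(x, len(A)):
        expected = sum(A[x:y + 1])
        assert count_total_shifted(shifted, x, y) == expected
        assert count_total_inclusive(inclusive, x, y) == expected
```

The trade-off is one extra array slot in the shifted version against one extra branch (x > 0) in the inclusive version.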

answered Sep 17 '22 14:09

ilim

Tags: python, algorithm, prefix-sum
