Python iterate through array while finding the mean of the top k elements

Question

Suppose I have a Python array a=[3, 5, 2, 7, 5, 3, 6, 8, 4]. My goal is to iterate through this array 3 elements at a time returning the mean of the top 2 of the three elements.

Using the above array, during my iteration step, the first three elements are [3, 5, 2] and the mean of the top 2 elements is 4. The next three elements are [5, 2, 7] and the mean of the top 2 elements is 6. The next three elements are [2, 7, 5] and the mean of the top 2 elements is again 6. ...

Hence, the result for the above array would be [4, 6, 6, 6, 5.5, 7, 7].

What is the nicest way to write such a function?

foslock · Accepted Answer

Solution

You can slice the list to work on each three-element sublist in turn. For each sublist, subtract the minimum from the sum to get the total of the top two elements, divide by two to get the mean, and append it to a result list.

Code

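def get_means(input_list):
    means = []
    # range() replaces the Python 2-only xrange(); one iteration per window
    for i in range(len(input_list) - 2):
        three_elements = input_list[i:i+3]
        # dropping the smallest of three values leaves the sum of the top two
        sum_top_two = sum(three_elements) - min(three_elements)
        means.append(sum_top_two / 2.0)
    return means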

Example

Running it on your example input gives the desired result:

print(get_means([3, 5, 2, 7, 5, 3, 6, 8, 4]))
# [4.0, 6.0, 6.0, 6.0, 5.5, 7.0, 7.0]
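Since the question title asks about the top k elements, the same loop can be generalized. The following is only a sketch: the function name `top_k_window_means` and its `window` and `k` parameters are hypothetical and not part of the original answer, and it swaps the sum-minus-min trick for `heapq.nlargest`, because subtracting the minimum only works when exactly one element is dropped.

```python
from heapq import nlargest


def top_k_window_means(values, window=3, k=2):
    """Mean of the k largest items in each sliding window (hypothetical helper)."""
    if not 1 <= k <= window:
        raise ValueError("k must be between 1 and window")
    means = []
    # range() is empty when the input is shorter than one window
    for i in range(len(values) - window + 1):
        top_k = nlargest(k, values[i:i + window])
        means.append(sum(top_k) / k)
    return means


print(top_k_window_means([3, 5, 2, 7, 5, 3, 6, 8, 4]))
# [4.0, 6.0, 6.0, 6.0, 5.5, 7.0, 7.0]
```

With the defaults it reproduces the result above; other window sizes and values of k follow the same pattern.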

And more...

Some of the other answers focus more on performance, including one that uses a generator to avoid building large in-memory lists: https://stackoverflow.com/a/49001728/416500

Maarten Fabré · Answer

I believe in splitting the code into separate parts. Here that would be three: getting the sliding window, getting the top 2 elements, and calculating the mean. The cleanest way to do this is with generators.

Sliding window

Slight variation on evamicur's answer using tee, islice and zip to create the window:

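import itertools

def windowed_iterator(iterable, n=2):
    # n independent iterators over the same input
    iterators = itertools.tee(iterable, n)
    # advance the i-th iterator by i items so the copies are staggered
    iterators = (itertools.islice(it, i, None) for i, it in enumerate(iterators))
    yield from zip(*iterators)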

windows = windowed_iterator(iterable=a, n=3)

[(3, 5, 2), (5, 2, 7), (2, 7, 5), (7, 5, 3), (5, 3, 6), (3, 6, 8), (6, 8, 4)]
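One practical difference from list slicing is that this window also works on one-shot iterators, which cannot be sliced or indexed. A minimal sketch, assuming a hypothetical generator `readings()` as the data source:

```python
import itertools


def windowed_iterator(iterable, n=2):
    iterators = itertools.tee(iterable, n)
    iterators = (itertools.islice(it, i, None) for i, it in enumerate(iterators))
    yield from zip(*iterators)


def readings():
    # hypothetical one-shot data source, used only for illustration
    yield from [3, 5, 2, 7, 5]


windows = list(windowed_iterator(readings(), n=3))
print(windows)
# [(3, 5, 2), (5, 2, 7), (2, 7, 5)]
```

Because `zip` stops at the shortest iterator, an input with fewer than `n` items simply yields no windows.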

Top 2 elements

To get the 2 highest elements of each window you can use any of the methods from the other answers; I think the heapq one is the clearest

from heapq import nlargest
top_n = map(lambda x: nlargest(2, x), windows)

or equivalently

top_n = (nlargest(2, i) for i in windows)

[[5, 3], [7, 5], [7, 5], [7, 5], [6, 5], [8, 6], [8, 6]]

Mean

from statistics import mean
means = map(mean, top_n)

[4, 6, 6, 6, 5.5, 7, 7]
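Putting the three stages together might look like the sketch below. The wrapper name `top_two_means` is hypothetical and not part of the original answer. Note that `map` and generator expressions are lazy, so nothing is computed until the result is consumed, here with `list()`.

```python
import itertools
from heapq import nlargest
from statistics import mean


def windowed_iterator(iterable, n=2):
    iterators = itertools.tee(iterable, n)
    shifted = (itertools.islice(it, i, None) for i, it in enumerate(iterators))
    yield from zip(*shifted)


def top_two_means(iterable):
    # hypothetical wrapper chaining the three stages lazily
    windows = windowed_iterator(iterable, n=3)
    top_n = (nlargest(2, w) for w in windows)
    return map(mean, top_n)


a = [3, 5, 2, 7, 5, 3, 6, 8, 4]
result = list(top_two_means(a))
print(result)
# [4, 6, 6, 6, 5.5, 7, 7]
```

Each stage is a generator or iterator, so only one window is held in memory at a time.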
