
Running average in Python

Is there a pythonic way to build up a list that contains a running average of some function?

After reading a fun little piece about Martians, black boxes, and the Cauchy distribution, I thought it would be fun to calculate a running average of the Cauchy distribution myself:

import math 
import random

def cauchy(location, scale):
    p = 0.0
    while p == 0.0:
        p = random.random()
    return location + scale*math.tan(math.pi*(p - 0.5))

# is this next block of code a good way to populate running_avg?
total = 0.0      # named total/limit so the built-ins sum() and max() are not shadowed
count = 0
limit = 10
running_avg = []
while count < limit:
    num = cauchy(3,1)
    total += num
    count += 1
    running_avg.append(total/count)

print(running_avg)     # or do something else with it, besides printing

I think that this approach works, but I'm curious whether there is a more elegant way to build up that running_avg list (e.g. a list comprehension) than using loops and counters.

There are some related questions, but they address more complicated problems (small window size, exponential weighting) or aren't specific to Python:

  • calculate exponential moving average in python
  • How to efficiently calculate a running standard deviation?
  • Calculating the Moving Average of a List
Nate Kohl asked Nov 24 '09 14:11


2 Answers

You could write a generator:

def running_average():
  total = 0.0   # avoid shadowing the built-in sum()
  count = 0
  while True:
    total += cauchy(3,1)
    count += 1
    yield total/count

Or, given a generator for Cauchy numbers and a utility function for a running sum generator, you can have a neat generator expression:

# Cauchy numbers generator
def cauchy_numbers():
  while True:
    yield cauchy(3,1)

# running sum utility function
def running_sum(iterable):
  total = 0.0   # avoid shadowing the built-in sum()
  for x in iterable:
    total += x
    yield total

# Running averages generator expression (** the neat part **)
running_avgs = (total/(i+1) for (i,total) in enumerate(running_sum(cauchy_numbers())))

# goes on forever
for avg in running_avgs:
  print(avg)

# alternatively (instead of the endless loop above), take just the first 10
import itertools
for avg in itertools.islice(running_avgs, 10):
  print(avg)
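
On Python 3.2 and later, the standard library's itertools.accumulate can stand in for the hand-written running_sum above. The following is only a minimal sketch under that assumption, not part of the original answer; the helper name running_averages is made up for illustration, and cauchy is repeated from the question so the block runs on its own.

```python
import itertools
import math
import random

def cauchy(location, scale):
    # same sampler as in the question, repeated so this block is self-contained
    p = 0.0
    while p == 0.0:
        p = random.random()
    return location + scale * math.tan(math.pi * (p - 0.5))

def running_averages(iterable):
    """Yield the running average of the values in iterable (hypothetical helper)."""
    totals = itertools.accumulate(iterable)          # running sums, Python 3.2+
    for count, total in enumerate(totals, 1):        # count starts at 1
        yield total / count

# ten Cauchy samples, consumed lazily by the generator
samples = (cauchy(3, 1) for _ in range(10))
running_avg = list(running_averages(samples))
print(running_avg)
```

Because running_averages accepts any iterable, it can also be checked against fixed input: list(running_averages([1, 2, 3, 4])) gives [1.0, 1.5, 2.0, 2.5].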
orip answered Oct 02 '22 15:10


You could use coroutines. They are similar to generators, but allow you to send in values. Coroutines were added in Python 2.5, so this won't work in earlier versions.

def running_average():
    total = 0.0   # avoid shadowing the built-in sum()
    count = 0
    value = yield(float('nan'))
    while True:
        total += value
        count += 1
        value = yield(total/count)

ravg = running_average()
next(ravg)   # advance the coroutine to the first yield

for i in range(10):
    avg = ravg.send(cauchy(3,1))
    print('Running average: %.6f' % (avg,))

As a list comprehension:

ravg = running_average()
next(ravg)
ravg_list = [ravg.send(cauchy(3,1)) for i in range(10)]
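
To make it easier to see what each send() returns, here is the same coroutine fed a fixed list of numbers instead of random Cauchy samples. This is just an illustrative sketch: the input values are made up, and the variable is named total rather than sum so the built-in is not shadowed.

```python
import math

def running_average():
    total = 0.0
    count = 0
    value = yield float('nan')      # placeholder result for the priming call
    while True:
        total += value
        count += 1
        value = yield total / count

ravg = running_average()
first = next(ravg)                  # run to the first yield; returns the nan placeholder
primed_is_nan = math.isnan(first)

# each send() resumes the coroutine and returns the updated average
results = [ravg.send(x) for x in [1, 2, 3, 4]]
print(results)                      # [1.0, 1.5, 2.0, 2.5]
```

The priming next() call is needed because a value can only be sent to a coroutine that is already paused at a yield.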

Edits:

  • Uses the next() function instead of the it.next() method, so that the code also works with Python 3. The next() function has also been back-ported to Python 2.6+.
    In Python 2.5, you can either replace the calls with it.next(), or define a next function yourself.
    (Thanks Adam Parkin)
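
For the "define a next function yourself" option, a fallback along these lines is one possibility. It is a sketch rather than something from the original answer, it is meant to sit at module level, and it only covers the one-argument form of next() (the built-in also accepts an optional default).

```python
try:
    next                            # already built in on Python 2.6+ and Python 3
except NameError:
    def next(iterator):
        # Python 2.5 fallback: delegate to the iterator's own next() method
        return iterator.next()

def countdown(n):
    # small generator used only to exercise next() below
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
first_value = next(gen)             # 3
second_value = next(gen)            # 2
```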
Markus Jarderot answered Oct 02 '22 16:10