I want to write this code in a more Pythonic way. My real array is much bigger than this example. I want to compute the average of the second items:
( 5+10+20+3+2 ) / 5
My attempt with numpy fails:
print(np.mean(array,key=lambda x:x[1]))
TypeError: mean() got an unexpected keyword argument 'key'
array = [('a', 5) , ('b', 10), ('c', 20), ('d', 3), ('e', 2)]
sum = 0
for i in range(len(array)):
    sum = sum + array[i][1]
average = sum / len(array)
print(average)
import numpy as np
print(np.mean(array,key=lambda x:x[1]))
How can I avoid this error? I would like to use something like the second example.
I'm using Python 3.7
Calculate Average of a Tuple. We can use the statistics.mean() function to calculate the mean value of a tuple, which is an immutable, ordered sequence of items.
Using sum(). In Python we can find the average of a list by simply using the sum() and len() functions. sum() returns the sum of the items in the list. len() returns the length, i.e. the number of elements, of the list.
How to Calculate Average. The average of a set of numbers is simply the sum of the numbers divided by the total number of values in the set. For example, suppose we want the average of 24, 55, 17, 87 and 100. Simply find the sum of the numbers, 24 + 55 + 17 + 87 + 100 = 283, and divide by 5 to get 56.6.
If you are using Python 3.4 or above, you could use the statistics module:
from statistics import mean
average = mean(value[1] for value in array)
Or if you're using a version of Python older than 3.4:
average = sum(value[1] for value in array) / len(array)
These solutions both use a nice feature of Python called a generator expression. The expression
value[1] for value in array
yields the values one at a time instead of building an intermediate list, which is efficient in both time and memory. See PEP 289 -- Generator Expressions.
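As a small sketch of how lazily a generator expression behaves, using the array from the question (the names `values`, `first` and `rest` are only illustrative):

```python
from statistics import mean

array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# A generator expression is lazy: nothing is computed until it is consumed.
values = (value[1] for value in array)

first = next(values)   # pulls only the first number
rest = list(values)    # consumes the remaining numbers

# mean() accepts any iterable, so a generator expression can be passed directly.
average = mean(value[1] for value in array)
print(first, rest, average)
```

Note that a generator can only be consumed once, which is why the call to mean() builds a fresh one.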
If you're using Python 2 and summing integers, the division is integer division, which truncates the result, e.g.:
>>> 25 / 4
6
>>> 25 / float(4)
6.25
To ensure we don't have integer division we could set the starting value of sum to be the float value 0.0. However, this also means we have to make the generator expression explicit with parentheses, otherwise it's a syntax error, and it's less pretty, as noted in the comments:
average = sum((value[1] for value in array), 0.0) / len(array)
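For comparison, here is a minimal sketch of how the same operators behave under Python 3 (which the question says it uses), where `/` is always true division and `//` is the truncating form. The variable names are illustrative only.

```python
array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

true_division = 25 / 4     # Python 3: always returns a float
floor_division = 25 // 4   # explicit floor division, like Python 2's int / int

# With a float start value the running total is a float from the beginning.
# The generator expression needs its own parentheses because it is not the
# only argument to sum().
total = sum((value[1] for value in array), 0.0)
average = total / len(array)
print(true_division, floor_division, average)
```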
It's probably best to use fsum from the math module, which will return a float:
from math import fsum
average = fsum(value[1] for value in array) / len(array)
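A short sketch of what fsum provides: it always returns a float, and it tracks intermediate rounding error so that sums of floats come out correctly rounded. The list `tenths` is just an illustrative input; how close the built-in sum() gets on the same data can vary by Python version, so only the fsum result is asserted exactly here.

```python
from math import fsum

array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# fsum returns a float even when every input is an int.
total = fsum(value[1] for value in array)
average = total / len(array)

# Ten copies of 0.1 sum to exactly 1.0 with fsum.
tenths = [0.1] * 10
precise = fsum(tenths)
builtin = sum(tenths)   # may differ in the last digit on some Python versions
print(average, precise, builtin)
```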
If you do want to use numpy, convert the list to a numpy.array and select the column you want using numpy indexing:
import numpy as np
array = np.array([('a', 5) , ('b', 10), ('c', 20), ('d', 3), ('e', 2)])
print(array[:,1].astype(float).mean())
# 8.0
The cast to a numeric type is needed because the original list contains both strings and numbers, so numpy converts every element to a string when it builds the array. In this case you could use float or int; it makes no difference to the mean.
If you're open to more golf-like solutions, you can transpose your array with vanilla Python, get a tuple of just the numbers, and calculate the mean. In Python 3, zip returns an iterator, so it has to be converted to a list before indexing:
sum(list(zip(*array))[1]) / len(array)
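To make the transposition step more explicit, here is a minimal sketch that unpacks the result of zip into two named tuples. The names `letters` and `numbers` are illustrative, and the code assumes Python 3.

```python
array = [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]

# zip(*array) pairs up the first items and the second items of every tuple,
# which effectively transposes the list of pairs.
letters, numbers = zip(*array)

average = sum(numbers) / len(numbers)
print(letters, numbers, average)
```

This builds both tuples in memory, so it suits data that comfortably fits in a list.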
With pure Python:
from operator import itemgetter
acc = 0
count = 0
for value in map(itemgetter(1), array):
    acc += value
    count += 1
mean = acc / count
An iterative approach can be preferable if your data cannot fit in memory as a list (since you said it was big). If it can, prefer a declarative approach:
data = [sub[1] for sub in array]
mean = sum(data) / len(data)
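For the case where the data does not fit in memory, the iterative approach can be wrapped in a function that accepts any iterable of pairs. This is only a sketch: `running_mean` and `generate_pairs` are hypothetical names, and the generator stands in for whatever large data source you actually have.

```python
def running_mean(pairs):
    """Average the second item of each pair without building a list."""
    total = 0
    count = 0
    for _, value in pairs:
        total += value
        count += 1
    if count == 0:
        raise ValueError("cannot average an empty iterable")
    return total / count


def generate_pairs():
    # Hypothetical stand-in for a large source such as a file or database cursor.
    yield from [('a', 5), ('b', 10), ('c', 20), ('d', 3), ('e', 2)]


result = running_mean(generate_pairs())
print(result)
```

Only one pair is held in memory at a time, at the cost of a slightly longer function than the one-liner.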
If you are open to using numpy, I find this cleaner:
import numpy as np
a = np.array(array)
mean = a[:, 1].astype(int).mean()
You can use map instead of a list comprehension:
sum(map(lambda x:int(x[1]), array)) / len(array)
or functools.reduce (if you use Python 2.x, just reduce, not functools.reduce):
import functools
functools.reduce(lambda acc, y: acc + y[1], array, 0) / len(array)