
Pythonic way to calculate the mean and variance of values in Counters

Is there a Pythonic way to compute the per-key means and variances across several Counters?

For example, I have four Counters sharing the same keys:

from collections import Counter

a = Counter({1: 23, 2: 39, 3: 1})
b = Counter({1: 28, 2: 39, 3: 1})
c = Counter({1: 23, 2: 39, 3: 2})
d = Counter({1: 23, 2: 22, 3: 1})

My current approach is to collect the values for each key into a list:

each_key_val = {}

for i in a.keys():  # Assumes all Counters share the same keys
    for j in [a, b, c, d]:
        try:
            each_key_val[i].append(j[i])
        except KeyError:  # First value seen for this key
            each_key_val[i] = [j[i]]

I can then find the mean and variance for each key i with NumPy (imported as np):

np.mean(each_key_val[i])
np.var(each_key_val[i])

Is there an easier way to compute the mean and variance for each key?

datadatadata asked Oct 19 '22 07:10



1 Answer

I don't claim the following is more readable than what you have, but it uses only comprehensions.

Say you have

cs = (a, b, c, d)

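Then a dictionary of the per-key means can be computed with

# sum() needs an explicit Counter() start value: the default start is 0, and 0 + Counter raises TypeError.
# In Python 3, use .items() (iteritems() exists only in Python 2), and / is already true division.
m = {k: total / len(cs) for k, total in sum(cs, Counter()).items()}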

For the variance, use the identity V[X] = E[X^2] - (E[X])^2. This is the population variance, which is also what np.var computes by default. First define the per-key mean of the squared values:

p = sum((Counter({k: v**2 / len(cs) for k, v in cn.items()}) for cn in cs),
        Counter())

then the variance dictionary is

{k: p[k] - m[k]**2 for k in m}
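
As a cross-check on the identity-based approach above, here is a minimal alternative sketch that uses only the standard library. It is not part of the original answer: it assumes Python 3.8+ (for statistics.fmean), and the names keys, values, means and variances are illustrative. It gathers the observations for each key into a list, as in the question, and passes them to statistics.fmean and statistics.pvariance (population variance, matching the default of np.var).

```python
from collections import Counter
from statistics import fmean, pvariance

a = Counter({1: 23, 2: 39, 3: 1})
b = Counter({1: 28, 2: 39, 3: 1})
c = Counter({1: 23, 2: 39, 3: 2})
d = Counter({1: 23, 2: 22, 3: 1})
cs = (a, b, c, d)

# Union of all keys; a key missing from one Counter is read as 0 there,
# because Counter returns 0 for absent keys.
keys = set().union(*cs)

# One list of observations per key, in the order of cs.
values = {k: [cn[k] for cn in cs] for k in keys}

# Per-key mean and population variance.
means = {k: fmean(v) for k, v in values.items()}
variances = {k: pvariance(v) for k, v in values.items()}
```

For key 1 (counts 23, 28, 23, 23) this gives a mean of 24.25 and a variance of 4.6875, the same values as np.mean and np.var on that list. Computing the variance from the raw observations also avoids subtracting two large, nearly equal numbers, which E[X^2] - (E[X])^2 can suffer from in floating point when the counts are large.
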
Ami Tavory answered Oct 22 '22 00:10
