Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

python histogram one-liner

There are many ways to write a Python program that computes a histogram.

By histogram, I mean a function that counts the occurrence of objects in an iterable and outputs the counts in a dictionary. For example:

>>> L = 'abracadabra'
>>> histogram(L)
{'a': 5, 'b': 2, 'c': 1, 'd': 1, 'r': 2}

One way to write this function is:

def histogram(L):
    d = {}
    for x in L:
        if x in d:
            d[x] += 1
        else:
            d[x] = 1
    return d

Are there more concise ways of writing this function?

If we had dictionary comprehensions in Python, we could write:

>>> { x: L.count(x) for x in set(L) }

but since Python 2.6 doesn't have them, we have to write:

>>> dict([(x, L.count(x)) for x in set(L)])

Although this approach may be readable, it is not efficient: L is walked-through multiple times. Furthermore, this won't work for single-life generators; the function should work equally well for iterator generators such as:

def gen(L):
    for x in L:
        yield x

We might try to use the reduce function (R.I.P.):

>>> reduce(lambda d,x: dict(d, x=d.get(x,0)+1), L, {}) # wrong!

Oops, this does not work: the key name is 'x', not x. :(

I ended with:

>>> reduce(lambda d,x: dict(d.items() + [(x, d.get(x, 0)+1)]), L, {})

(In Python 3, we would have to write list(d.items()) instead of d.items(), but it's hypothethical, since there is no reduce there.)

Please beat me with a better, more readable one-liner! ;)

like image 342
mykhal Avatar asked May 20 '10 01:05

mykhal


People also ask

How do you make a histogram beautiful in Python?

We can achieve this by increasing the number of bins, which is essentially the number of classes the histogram divides the data into. More bins will make the histogram smoother.

How do you normalize a histogram in Python?

To normalize a histogram in Python, we can use hist() method. In normalized bar, the area underneath the plot should be 1.


3 Answers

Python 3.x does have reduce, you just have to do a from functools import reduce. It also has "dict comprehensions", which have exactly the syntax in your example.

Python 2.7 and 3.x also have a Counter class which does exactly what you want:

from collections import Counter
cnt = Counter("abracadabra")

In Python 2.6 or earlier, I'd personally use a defaultdict and do it in 2 lines:

d = defaultdict(int)
for x in xs: d[x] += 1

That's clean, efficient, Pythonic, and much easier for most people to understand than anything involving reduce.

like image 160
Eli Courtwright Avatar answered Sep 21 '22 03:09

Eli Courtwright


It's kinda cheaty to import modules for oneliners, so here's a oneliner that is O(n) and works at least as far back as Python2.4

>>> f=lambda s,d={}:([d.__setitem__(i,d.get(i,0)+1) for i in s],d)[-1]
>>> f("ABRACADABRA")
{'A': 5, 'R': 2, 'B': 2, 'C': 1, 'D': 1}

And if you think __ methods are hacky, you can always do this

>>> f=lambda s,d=lambda:0:vars(([setattr(d,i,getattr(d,i,0)+1) for i in s],d)[-1])
>>> f("ABRACADABRA")
{'A': 5, 'R': 2, 'B': 2, 'C': 1, 'D': 1}

:)

like image 34
John La Rooy Avatar answered Sep 25 '22 03:09

John La Rooy


$d{$_} += 1 for split //, 'abracadabra';
like image 37
perl Avatar answered Sep 21 '22 03:09

perl