
Python collections.Counter() runtime

Tags:

python

counter

I just ran into a problem where I need to turn a list, e.g. l = [1, 2, 3, 4], into a dict of counts, e.g. {1: 1, 2: 1, 3: 1, 4: 1}. Should I use collections.Counter() or write the loop myself? Is the built-in faster than a hand-written loop?

Sharon Tan asked Oct 24 '25 14:10



1 Answer

You can always check which approach is faster with the timeit module. In Python 3, Counter uses a C-accelerated helper for counting and is very fast indeed:

>>> from timeit import timeit
>>> import random, string
>>> from collections import Counter, defaultdict
>>> def count_manually(it):
...     res = defaultdict(int)
...     for el in it:
...         res[el] += 1
...     return res
...
>>> test_data = [random.choice(string.printable) for _ in range(10000)]
>>> timeit('count_manually(test_data)', 'from __main__ import test_data, count_manually', number=2000)
1.4321454349992564

>>> timeit('Counter(test_data)', 'from __main__ import test_data, Counter', number=2000)
0.776072466003825

Here Counter() was roughly 1.8 times faster (about 0.78 s versus 1.43 s for 2000 runs).

That said, unless you are counting in a performance-critical section of your code, focus on readability and maintainability, and in that respect Counter() wins hands-down over write-your-own code.
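To make the readability point concrete, here is a minimal sketch using the list from the question. The helper name count_with_loop is hypothetical and only there for comparison:

```python
from collections import Counter

l = [1, 2, 3, 4]

def count_with_loop(items):
    """Hand-written counting with a plain dict (hypothetical helper)."""
    counts = {}
    for item in items:
        # Start at 0 for unseen items, then add one occurrence.
        counts[item] = counts.get(item, 0) + 1
    return counts

loop_counts = count_with_loop(l)
counter_counts = Counter(l)  # the whole loop above in one call

# Counter is a dict subclass; converting it shows both hold the same counts.
print(loop_counts)
print(dict(counter_counts) == loop_counts)
```

Both produce the same mapping; the Counter version simply says what it does in one line.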

On top of that, Counter() objects offer functionality beyond plain dictionaries: they can be treated as multisets (you can add or subtract counters, and produce unions or intersections), and they can efficiently give you the top N elements by count.
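A short sketch of those multiset operations and of most_common(), using made-up sample strings chosen only for illustration:

```python
from collections import Counter

a = Counter("aabbbc")   # a: 2, b: 3, c: 1
b = Counter("abcccc")   # a: 1, b: 1, c: 4

total = a + b           # add counts per element
difference = a - b      # subtract counts, dropping results that are not positive
union = a | b           # maximum of each count
intersection = a & b    # minimum of each count

# The two most frequent elements of the combined counter, highest count first.
top_two = total.most_common(2)

print(total)
print(difference)
print(union)
print(intersection)
print(top_two)
```

Note that subtraction with the - operator discards zero and negative counts, which is why "c" does not appear in difference.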

Martijn Pieters answered Oct 27 '25 03:10
