
Adding counters deletes keys

See below, why does the implementation of += blow away a key in my original counter?

>>> from collections import Counter
>>> c = Counter({'a': 0, 'b': 0, 'c': 0})
>>> c.items()
[('a', 0), ('c', 0), ('b', 0)]
>>> c += Counter('abba')
>>> c.items()
[('a', 2), ('b', 2)]

I think that's impolite, to say the least: there is quite a difference between "X was counted 0 times" and "we aren't even counting Xs". It seems like collections.Counter is not a counter at all; it's more like a multiset.

But counters are a subclass of dict and we're allowed to construct them with zero or negative values: Counter(a=0, b=-1). If it's actually a "bag of things", wouldn't this be prohibited, restricting init to accept an iterable of hashable items?

To further confuse matters, Counter implements update and subtract methods, which behave differently from the + and - operators. It seems like this class is having an identity crisis!
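To make that difference concrete, here is a minimal sketch comparing the in-place methods with the operators. The variable names (base, other, and so on) are only illustrative; the behaviour shown is what current CPython's collections.Counter does, as far as I can tell.

```python
from collections import Counter

base = Counter({'a': 0, 'b': 0, 'c': 0})
other = Counter('abba')  # counts: a=2, b=2

# The + operator builds a new Counter and keeps only positive counts.
added = base + other

# update() adds counts in place and leaves the zero entry for 'c' alone.
updated = Counter(base)
updated.update(other)

# The - operator drops non-positive results; subtract() keeps them.
diff_op = base - other
diff_method = Counter(base)
diff_method.subtract(other)

print(sorted(added.items()))        # [('a', 2), ('b', 2)]
print(sorted(updated.items()))      # [('a', 2), ('b', 2), ('c', 0)]
print(sorted(diff_op.items()))      # []
print(sorted(diff_method.items()))  # [('a', -2), ('b', -2), ('c', 0)]
```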

Is a Counter a dict or a bag?

asked Feb 19 '14 16:02 by wim


1 Answer

Counters are a kind of multiset. From the Counter() documentation:

Several mathematical operations are provided for combining Counter objects to produce multisets (counters that have counts greater than zero). Addition and subtraction combine counters by adding or subtracting the counts of corresponding elements. Intersection and union return the minimum and maximum of corresponding counts. Each operation can accept inputs with signed counts, but the output will exclude results with counts of zero or less.

Emphasis mine.

Further on, it gives some more detail about the multiset nature of Counters:

Note: Counters were primarily designed to work with positive integers to represent running counts; however, care was taken to not unnecessarily preclude use cases needing other types or negative values. To help with those use cases, this section documents the minimum range and type restrictions.

[...]

  • The multiset methods are designed only for use cases with positive values. The inputs may be negative or zero, but only outputs with positive values are created. There are no type restrictions, but the value type needs to support addition, subtraction, and comparison.
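As a rough illustration of that rule, the sketch below feeds signed counts into all four multiset operators. The counters x and y are made-up inputs; in each case only entries whose result is positive should survive.

```python
from collections import Counter

# Hypothetical inputs that mix positive, zero, and negative counts.
x = Counter(a=3, b=0, c=-2)
y = Counter(a=1, b=2, c=-5)

print(dict(x + y))  # {'a': 4, 'b': 2}   c: -2 + -5 = -7 is dropped
print(dict(x - y))  # {'a': 2, 'c': 3}   b: 0 - 2 = -2 is dropped
print(dict(x & y))  # {'a': 1}           minimum of each pair, positives only
print(dict(x | y))  # {'a': 3, 'b': 2}   maximum of each pair, positives only
```

Note that x - y can produce a key ('c') whose inputs were both negative, because only the result is checked against zero.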

So Counter objects are both dictionaries and bags. Standard dictionaries, however, don't support addition, while Counters do, so it's not as if Counters are breaking a precedent set by dictionaries here.

If you want to retain the zeros, use Counter.update(), which adds counts in place without dropping any keys, and pass in the result of Counter.elements() of the other object:

c.update(Counter('abba').elements())

Demo:

>>> from collections import Counter
>>> c = Counter({'a': 0, 'b': 0, 'c': 0})
>>> c.update(Counter('abba').elements())
>>> c
Counter({'a': 2, 'b': 2, 'c': 0})
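As a side note, Counter.update() also accepts a mapping, so the other Counter can probably be passed in directly. The sketch below compares the two forms; the names via_elements, via_mapping, and signed are illustrative only. One caveat: elements() skips entries whose count is below one, so the two forms diverge when the other counter holds negative counts.

```python
from collections import Counter

other = Counter('abba')

# Form used above: expand the other counter into its elements.
via_elements = Counter({'a': 0, 'b': 0, 'c': 0})
via_elements.update(other.elements())

# Alternative: pass the Counter (a mapping) straight to update().
via_mapping = Counter({'a': 0, 'b': 0, 'c': 0})
via_mapping.update(other)

print(sorted(via_elements.items()))  # [('a', 2), ('b', 2), ('c', 0)]
print(sorted(via_mapping.items()))   # [('a', 2), ('b', 2), ('c', 0)]

# With a negative count in the other counter, the two forms differ.
signed = Counter(a=2, b=-1)

lossy = Counter({'a': 0, 'b': 0, 'c': 0})
lossy.update(signed.elements())      # elements() never yields 'b'

exact = Counter({'a': 0, 'b': 0, 'c': 0})
exact.update(signed)                 # the -1 for 'b' is applied

print(sorted(lossy.items()))  # [('a', 2), ('b', 0), ('c', 0)]
print(sorted(exact.items()))  # [('a', 2), ('b', -1), ('c', 0)]
```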
answered Sep 27 '22 20:09 by Martijn Pieters