
Creating an Ordered Counter

I've been reading into how super() works. I came across this recipe that demonstrates how to create an Ordered Counter:

from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first seen'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__,
                           OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

For example:

oc = OrderedCounter('adddddbracadabra')

print(oc)

OrderedCounter(OrderedDict([('a', 5), ('d', 6), ('b', 2), ('r', 2), ('c', 1)]))

Is someone able to explain how this magically works?

This also appears in the Python documentation.

asked Feb 17 '16 at 00:02 by Sean




2 Answers

OrderedCounter is given as an example in the OrderedDict documentation, and works without needing to override any methods:

class OrderedCounter(Counter, OrderedDict):
    pass

When a method is called on an instance, Python has to find the correct implementation to execute. It searches the class hierarchy in a defined order called the "method resolution order", or mro. The mro is stored in the class attribute __mro__:

OrderedCounter.__mro__

(<class '__main__.OrderedCounter'>, <class 'collections.Counter'>, <class 'collections.OrderedDict'>, <class 'dict'>, <class 'object'>)

When an instance of OrderedCounter calls __setitem__(), Python searches the classes in order: OrderedCounter, Counter, OrderedDict (where it is found). So a statement like oc['a'] = 0 ends up calling OrderedDict.__setitem__().

In contrast, __getitem__ is not overridden by any of the subclasses in the mro, so count = oc['a'] is handled by dict.__getitem__().

oc = OrderedCounter()    
oc['a'] = 1             # this call uses OrderedDict.__setitem__
count = oc['a']         # this call uses dict.__getitem__
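
One rough way to check this yourself is to walk the mro and report the first class that defines a given name in its own namespace. The helper defining_class below is a hypothetical name made up for this sketch (it is not part of collections), and the results assume CPython's implementation of OrderedDict.

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    pass


def defining_class(cls, name):
    """Return the first class in cls.__mro__ that defines `name` itself.

    Hypothetical helper, for illustration only.
    """
    for klass in cls.__mro__:
        if name in vars(klass):
            return klass
    raise AttributeError(name)


# Report where each lookup ends up for an OrderedCounter instance.
for name in ('__setitem__', '__getitem__', 'update'):
    print(name, '->', defining_class(OrderedCounter, name).__name__)
```

This is the same search Python performs for an attribute lookup on the class, just spelled out by hand.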

A more interesting call sequence occurs for a statement like oc.update('foobar'). First, Counter.update() gets called. The code for Counter.update() assigns to self[elem], which gets turned into a call to OrderedDict.__setitem__(). That, in turn, calls dict.__setitem__().
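
To watch that sequence happen, here is a minimal sketch that records every item assignment made during update(). TracingOrderedCounter and the calls list are hypothetical names used only for this demonstration; the extra __setitem__ just logs and then defers to the next class in the mro.

```python
from collections import Counter, OrderedDict

calls = []  # records every (key, value) passed to __setitem__


class TracingOrderedCounter(Counter, OrderedDict):
    """Hypothetical subclass used only to observe the calls."""

    def __setitem__(self, key, value):
        calls.append((key, value))
        # Counter has no __setitem__, so super() reaches OrderedDict's.
        super().__setitem__(key, value)


oc = TracingOrderedCounter()
oc.update('foobar')  # resolved to Counter.update

print(calls)             # one entry per character counted
print(list(oc.items()))  # keys in the order they were first seen
```

Each character triggers one assignment, and because those assignments go through OrderedDict.__setitem__(), the keys end up in first-seen order.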

If the base classes are reversed, it no longer works, because the mro is different and the wrong methods get called.

class OrderedCounter(OrderedDict, Counter):   # <<<== doesn't work
    pass
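
A short sketch of what goes wrong, assuming a string is passed in as in the question. ReversedBases is a hypothetical name for this demo. With OrderedDict first, __init__ resolves to OrderedDict.__init__, which expects key/value pairs rather than items to count.

```python
from collections import Counter, OrderedDict


class ReversedBases(OrderedDict, Counter):  # hypothetical name for this demo
    pass


# OrderedDict now comes before Counter in the mro.
mro_names = [c.__name__ for c in ReversedBases.__mro__]
print(mro_names)

try:
    # Counter.__init__ is never reached, so the characters are not counted.
    ReversedBases('abracadabra')
except (TypeError, ValueError) as exc:
    error = exc
else:
    error = None

print(type(error).__name__ if error else 'no error')
```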

More info on the mro can be found in the Python 2.3 documentation on the method resolution order.

answered Oct 10 '22 at 02:10 by RootTwo


I think the class needs to define the __repr__ and __reduce__ methods when words are given as input.

Without repr and reduce:

from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    pass

oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
print(oc)

Output:

OrderedCounter({'apple': 2, 'mango': 2, 'banana': 1, 'cherry': 1, 'pie': 1})

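The printed order in the above example does not match the order in which the words were first seen, because the inherited Counter.__repr__ lists the entries from most common to least common.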
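
For what it's worth, a quick check suggests the counts are still stored in first-seen order here. Only the display changes, since the inherited Counter.__repr__ formats the entries via most_common(). A minimal sketch comparing the two orders:

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    pass


words = ['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango']
oc = OrderedCounter(words)

# Iteration follows the order in which keys were first inserted.
first_seen = list(oc.items())
# most_common() sorts by count, which is the order the default repr shows.
by_count = oc.most_common()

print(first_seen)
print(by_count)
```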

With repr and reduce:

from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])
print(oc)

Output:

OrderedCounter(OrderedDict([('apple', 2), ('banana', 1), ('cherry', 1), ('mango', 2), ('pie', 1)]))
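
The __reduce__ method does not affect printing; as far as I understand, it matters for pickling and copying. A minimal sketch of what it returns, rebuilding the object by hand roughly the way pickle would (factory, args and clone are just illustrative names):

```python
from collections import Counter, OrderedDict


class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))

    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)


oc = OrderedCounter(['apple', 'banana', 'cherry', 'mango', 'apple', 'pie', 'mango'])

# __reduce__ returns a callable plus the arguments needed to rebuild the object.
factory, args = oc.__reduce__()
clone = factory(*args)

print(clone)
print(list(clone.items()) == list(oc.items()))
```

Passing an OrderedDict as the argument is what lets the rebuilt object keep the same key order.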

answered Oct 10 '22 at 02:10 by mhpd