For example, I need to count how many times each word appears in a list, with the result ordered not by frequency but by the order in which the words first appear, i.e. insertion order.
from collections import Counter
words = ['oranges', 'apples', 'apples', 'bananas', 'kiwis', 'kiwis', 'apples']
c = Counter(words)
print(c)
So instead of: {'apples': 3, 'kiwis': 2, 'bananas': 1, 'oranges': 1}
I'd rather get: {'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}
I don't really need Counter specifically; any approach that produces the correct result is fine with me.
You can use the recipe that combines collections.Counter and collections.OrderedDict:
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)
words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
c = OrderedCounter(words)
print(c)
# OrderedCounter(OrderedDict([('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)]))
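As a rough check that this subclass still behaves like a regular Counter, here is a small sketch built on the recipe above. The extra words passed to update() are made up for illustration. It shows that existing keys keep their position, new keys are appended at the end, and most_common() is still available.

```python
from collections import Counter, OrderedDict

class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)

words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
c = OrderedCounter(words)

# Iteration follows first-encounter order, not frequency.
first_seen_order = list(c.items())
print(first_seen_order)
# [('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)]

# Illustrative extra input: 'apples' stays in place, 'pears' goes to the end.
c.update(["pears", "apples"])
after_update = list(c.items())
print(after_update)
# [('oranges', 1), ('apples', 4), ('bananas', 1), ('kiwis', 2), ('pears', 1)]

# Regular Counter methods still work when frequency order is wanted.
top = c.most_common(1)
print(top)
# [('apples', 4)]
```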
In CPython 3.6, dict maintains insertion order as an implementation detail, and from Python 3.7 onward this is guaranteed by the language.
So you can do:
words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
counter = {}
for w in words:
    counter[w] = counter.get(w, 0) + 1
>>> counter
{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}
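If you need this in more than one place, the loop can be wrapped in a small helper. This is only a sketch: the name count_in_order is hypothetical, and it assumes Python 3.7+ so that the plain dict is guaranteed to keep insertion order. It accepts any iterable of hashable items, including a generator.

```python
def count_in_order(items):
    """Count items, keeping keys in the order they are first seen.

    Hypothetical helper; relies on dict insertion order (Python 3.7+).
    """
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts

words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
result = count_in_order(words)
print(result)
# {'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}

# Works with a generator too, e.g. to normalise case before counting.
normalised = count_in_order(w.lower() for w in ["Kiwi", "apple", "kiwi"])
print(normalised)
# {'kiwi': 2, 'apple': 1}
```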
Unfortunately, Counter in Python 3.6 and 3.7 does not display the insertion order that it maintains; instead, its __repr__ sorts the items from most to least common.
However, you can reuse the OrderedDict recipe with the Python 3.6+ dict in place of OrderedDict:
from collections import Counter
class OrderedCounter(Counter, dict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, dict(self))
    def __reduce__(self):
        return self.__class__, (dict(self),)
>>> OrderedCounter(words)
OrderedCounter({'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2})
Or, since Counter is a subclass of dict, which maintains insertion order in Python 3.6+, you can simply bypass Counter's __repr__ by either calling .items() on the counter or converting the counter back into a dict:
>>> c = Counter(words)
This presentation of the Counter uses Counter's __repr__ method and is sorted from most to least common element:
>>> c
Counter({'apples': 3, 'kiwis': 2, 'oranges': 1, 'bananas': 1})
This presentation is in the order encountered, i.e. insertion order:
>>> c.items()
dict_items([('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)])
Or,
>>> dict(c)
{'oranges': 1, 'apples': 3, 'bananas': 1, 'kiwis': 2}
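To put the two orderings side by side, here is a minimal sketch assuming Python 3.7+. It relies on the fact that Counter's repr is built from most_common(), while plain iteration over the counter follows insertion order; ties in most_common() keep their first-encountered order.

```python
from collections import Counter

words = ["oranges", "apples", "apples", "bananas", "kiwis", "kiwis", "apples"]
c = Counter(words)

# Plain iteration: the order in which each word was first seen.
insertion_order = list(c.items())
print(insertion_order)
# [('oranges', 1), ('apples', 3), ('bananas', 1), ('kiwis', 2)]

# most_common(): sorted by count, highest first; equal counts stay in
# first-encountered order.
by_frequency = c.most_common()
print(by_frequency)
# [('apples', 3), ('kiwis', 2), ('oranges', 1), ('bananas', 1)]
```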