I have a counter declared as main_dict = Counter(), and values are added with main_dict[word] += 1. At the end I want to remove all elements with a frequency below 15. Is there a function in Counter that does this?
Any help appreciated.
A Counter is a collection in which elements are stored as dict keys and their counts as dict values. Counts can be positive, zero, or negative integers. There is no restriction on its keys or values: the values are intended to be numbers, but other objects can be stored too.
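As a small illustration of the point about counts, here is a minimal sketch; the keys "spam", "eggs", and "ham" are made up for the example. Note that a key with a zero or negative count is still stored, which matters when you later filter by frequency.

```python
from collections import Counter

counts = Counter()
counts["spam"] += 1   # a missing key starts from 0, so no setup is needed
counts["eggs"] -= 2   # counts may go negative
counts["ham"] = 0     # a zero count is still a stored key

print(counts["eggs"])    # -2
print("ham" in counts)   # True
```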
A Counter stores its data in a hash table, just like a dict: the elements are the keys and their counts are the values. It lets you count the items in any iterable. Addition, subtraction, intersection, and union can be performed directly on Counter objects.
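The four operations mentioned above can be sketched as follows. The two counters a and b are invented sample data, not taken from the question.

```python
from collections import Counter

a = Counter({"foo": 3, "bar": 1})
b = Counter({"foo": 1, "bar": 2, "baz": 4})

total = a + b        # add counts: foo 4, bar 3, baz 4
difference = a - b   # subtract, keeping only positive results: foo 2
common = a & b       # intersection, minimum of each count: foo 1, bar 1
either = a | b       # union, maximum of each count: foo 3, bar 2, baz 4
```

All four operators return a new Counter and drop any key whose resulting count is zero or negative.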
If a value has not been seen in the input, its count is 0, and looking it up does not raise a KeyError. The elements() method returns an iterator that yields each element as many times as its count; elements with a count below one are skipped.
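A short sketch of both behaviours, using a made-up input string:

```python
from collections import Counter

letters = Counter("abca")   # counts: a=2, b=1, c=1

print(letters["e"])         # 0, no KeyError is raised
print("e" in letters)       # False: the lookup did not add the key
print(sorted(letters.elements()))   # ['a', 'a', 'b', 'c']
```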
The size of a Counter, meaning the number of distinct keys, is len(c), and this is an O(1) operation. It also helps to state precisely what you want to know: "how many items are in a Python Counter?" leads to the same answer, len(c).
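To make the distinction concrete, the sketch below contrasts the number of distinct keys with the total number of items counted. The sample list is invented for the example.

```python
from collections import Counter

c = Counter(["foo", "bar", "foo", "baz", "foo"])

distinct = len(c)               # number of distinct keys: 3
occurrences = sum(c.values())   # total number of items counted: 5
```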
>>> from collections import Counter
>>> counter = Counter({'baz': 20, 'bar': 15, 'foo': 10})
>>> Counter({k: c for k, c in counter.items() if c >= 15})
Counter({'baz': 20, 'bar': 15})
No, you'll need to remove them manually. Using itertools.dropwhile() makes that a little easier, perhaps:
from itertools import dropwhile

for key, count in dropwhile(lambda key_count: key_count[1] >= 15, main_dict.most_common()):
    del main_dict[key]
Demonstration:
>>> main_dict
Counter({'baz': 20, 'bar': 15, 'foo': 10})
>>> for key, count in dropwhile(lambda key_count: key_count[1] >= 15, main_dict.most_common()):
...     del main_dict[key]
...
>>> main_dict
Counter({'baz': 20, 'bar': 15})
By using dropwhile you only need to test the keys for which the count is 15 or over; once the first count below 15 is reached, it forgoes testing and just passes everything else through. That works great with the sorted most_common() list. If there are a lot of values below 15, that saves execution time for all those tests.
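If you need this in more than one place, the loop can be wrapped in a small function. This is only a sketch: prune_below is a hypothetical helper name, not part of collections or itertools, and the sample data extends the demonstration above with an invented 'qux' entry.

```python
from collections import Counter
from itertools import dropwhile


def prune_below(counter, threshold):
    """Delete, in place, every key whose count is below threshold."""
    # most_common() returns a new list sorted by descending count, so
    # deleting from the counter while looping over that list is safe.
    ranked = counter.most_common()
    for key, _count in dropwhile(lambda kc: kc[1] >= threshold, ranked):
        del counter[key]
    return counter


main_dict = Counter({"baz": 20, "bar": 15, "foo": 10, "qux": 3})
prune_below(main_dict, 15)
print(main_dict)  # Counter({'baz': 20, 'bar': 15})
```

The dict comprehension shown earlier builds a new Counter instead; this version modifies the existing one, which is useful when other code holds a reference to it.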