I want to generate an ordered list of the least common words within a large body of text, with the least common word appearing first along with a value indicating how many times it appears in the text.
I scraped the text from some online journal articles, then simply assigned it to a variable and split it:
article_one = """ large body of text """.split()
=> ['large', 'body', 'of', 'text']
It seems like a regex would be appropriate for the next steps, but being new to programming I'm not well versed in them. If the best answer includes a regex, could someone point me to a good regex tutorial other than pydoc?
The least common multiple of two integers a and b, denoted LCM(a,b), is the smallest positive integer that is evenly divisible by both a and b. For example, LCM(2,3) = 6 and LCM(6,10) = 30.
Counter is a subclass of dict that's specially designed for counting hashable objects in Python. It's a dictionary that stores objects as keys and counts as values. To count with Counter, you typically provide a sequence or iterable of hashable objects as an argument to the class's constructor.
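As a rough sketch of how that applies to the question, here is Counter fed a list of words, with the result sorted so the rarest words come first. The sample sentence and the name `least_first` are made up for illustration.

```python
from collections import Counter

# Hypothetical sample text; in the question this would be the scraped article.
words = "the cat and the dog and the bird".split()

# Counter maps each word to the number of times it appears.
counts = Counter(words)

# Sort by count first, then alphabetically, so the rarest words lead the list.
least_first = sorted(counts.items(), key=lambda pair: (pair[1], pair[0]))
print(least_first)
```

Sorting on `(count, word)` keeps the order of ties predictable, which plain `most_common()` does not promise to do alphabetically.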
How about a shorter, simpler version with a defaultdict? Counter is nice but needs Python 2.7; this works from 2.5 and up :)
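import collections

# Missing keys default to 0, so each word can be incremented directly.
counter = collections.defaultdict(int)
article_one = """ large body of text """
for word in article_one.split():
    counter[word] += 1
# Reversing each (word, count) pair sorts by count first, then by word.
print(sorted(counter.items(), key=lambda x: x[::-1]))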
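On the regex part of the question: a plain `split()` leaves punctuation attached and treats "The" and "the" as different words. One possible way around that, sketched below, is to lowercase the text and pull out words with `re.findall`. The `tokenize` helper, the pattern, and the sample sentence are assumptions for illustration, and the pattern only handles ASCII letters and apostrophes. For a tutorial, the "Regular Expression HOWTO" in the official Python documentation is a gentler read than pydoc.

```python
import re
from collections import defaultdict

def tokenize(text):
    # Hypothetical helper: lowercase the text, then keep runs of ASCII
    # letters and apostrophes. Punctuation and digits are dropped.
    return re.findall(r"[a-z']+", text.lower())

# Made-up sample; the real input would be the scraped article text.
sample = "The cat sat. The cat's hat, the HAT!"

counter = defaultdict(int)
for word in tokenize(sample):
    counter[word] += 1

# Same sort as in the answer: reversed pairs order by count, then by word.
ranked = sorted(counter.items(), key=lambda x: x[::-1])
print(ranked)
```

Whether apostrophes should stay inside words (so that "cat's" and "cat" count separately) depends on the text, so treat the pattern as a starting point.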
Finding the least common elements in a list. According to the documentation for the Counter class in the collections module:
c.most_common()[:-n-1:-1] # n least common elements
So the code for the least common element in a list is:
from collections import Counter
Counter(mylist).most_common()[:-2:-1]
The code for the two least common elements is:
from collections import Counter
Counter(mylist).most_common()[:-3:-1]
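The two slices above are the n = 1 and n = 2 cases of the same pattern, so they can be wrapped in a small function. This is only a sketch; `least_common` is a name I made up (Counter itself has no such method), and `mylist` here is sample data.

```python
from collections import Counter

def least_common(items, n):
    """Return the n least common (item, count) pairs, rarest first."""
    # most_common() returns pairs sorted by descending count; the slice
    # [:-n - 1:-1] walks the last n entries of that list backwards.
    return Counter(items).most_common()[:-n - 1:-1]

# Hypothetical sample list: 'a' appears 3 times, 'b' twice, 'c' once.
mylist = ["a", "b", "a", "c", "a", "b"]

print(least_common(mylist, 1))
print(least_common(mylist, 2))
```

Note that `most_common()` sorts the whole collection first, so for a very large vocabulary where only a handful of rare words are needed, something like `heapq.nsmallest` on the counts might be worth comparing.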