 

Finding least common elements in a list

Tags:

python

list

I want to generate an ordered list of the least common words within a large body of text, with the least common word appearing first along with a value indicating how many times it appears in the text.

I scraped the text from some online journal articles, then simply assigned and split:

article_one = """ large body of text """.split() 
=> ["large", "body", "of", "text"]

A regex seems appropriate for the next steps, but being new to programming I'm not well versed in them. If the best answer includes a regex, could someone point me to a good regex tutorial other than pydoc?

Benjamin James asked Jan 31 '13 01:01





2 Answers

How about a shorter, simpler version with a defaultdict? Counter is nice but needs Python 2.7; this works from 2.5 and up :)

import collections

counter = collections.defaultdict(int)
article_one = """ large body of text """

for word in article_one.split():
    counter[word] += 1

# Sort the (word, count) pairs by (count, word) so the least common words come first
print sorted(counter.iteritems(), key=lambda x: x[::-1])
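
For anyone on Python 3, where print is a function and dict.iteritems() no longer exists, the same idea can be sketched roughly like this. The function name least_common_words and the sample text are made up for illustration and are not part of the original answer.

```python
from collections import defaultdict

def least_common_words(text):
    """Return (word, count) pairs, least common first; ties sorted alphabetically."""
    counts = defaultdict(int)
    for word in text.split():
        counts[word] += 1
    # Sort on (count, word), the same ordering that key=lambda x: x[::-1] produces
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))

# Hypothetical sample text, just to show the shape of the result
sample = "the cat and the dog and the bird"
print(least_common_words(sample))
# -> [('bird', 1), ('cat', 1), ('dog', 1), ('and', 2), ('the', 3)]
```

Each pair carries the word and its count, which matches what the question asks for: an ordered list with the least common word first.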
Wolph answered Oct 20 '22 03:10



According to the documentation for the Counter class in the collections module:

c.most_common()[:-n-1:-1]       # n least common elements

So the code for the single least common element in a list is:

from collections import Counter
Counter(mylist).most_common()[:-2:-1]

And for the two least common elements:

from collections import Counter
Counter(mylist).most_common()[:-3:-1]
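
To tie this back to the question, here is a minimal sketch that combines the slice with a simple regex tokenizer, so that punctuation and capitalization do not split one word into several. The helper name n_least_common, the sample text, and the pattern [a-z]+ are my own assumptions for illustration; real article text may need a richer pattern (apostrophes, hyphens, non-ASCII letters).

```python
import re
from collections import Counter

def n_least_common(words, n):
    """Return the n least common (word, count) pairs, least common first."""
    # most_common() sorts from most to least common; the negative-step
    # slice walks it backwards and stops after n items.
    return Counter(words).most_common()[:-n - 1:-1]

# Hypothetical sample text
text = "The cat saw the dog. The dog saw the dog!"
# Lowercase, then keep only runs of letters so "dog." and "dog!" both count as "dog"
words = re.findall(r"[a-z]+", text.lower())

print(n_least_common(words, 2))
# -> [('cat', 1), ('saw', 2)]
```

Passing len(set(words)) as n gives the full list ordered from least to most common.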

python-3.x

dwalsh84 answered Oct 20 '22 01:10
