Counting unique words in Python

Tags:

python

word-count

My code so far is this:

from glob import glob

pattern = "D:\\report\\shakeall\\*.txt"
filelist = glob(pattern)

def countwords(fp):
    # Count the whitespace-separated words in one file.
    with open(fp) as fh:
        return len(fh.read().split())

print "There are", sum(map(countwords, filelist)), "words in the files. From directory", pattern

I want to add code that counts the unique words across the files matched by pattern (42 txt files in this path), but I don't know how. Can anybody help me?

asked Aug 10 '12 10:08

2 Answers

The best way to count objects in Python is to use the collections.Counter class, which was created for that purpose. It acts like a Python dict but is a bit easier to use when counting. You can just pass it a list of objects and it counts them for you automatically.

>>> from collections import Counter
>>> c = Counter(['hello', 'hello', 1])
>>> print c
Counter({'hello': 2, 1: 1})

Counter also has some useful methods, such as most_common; see the documentation to learn more.
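
As a small illustration of most_common, here is a minimal sketch. The word list is made up for the example, and the code is written to run on both Python 2.7 and Python 3. It also shows that len() of a Counter gives the number of distinct items, which is what the question is after.

```python
from collections import Counter

# Hypothetical sample words, used only for illustration.
words = ['to', 'be', 'or', 'not', 'to', 'be', 'to']
counts = Counter(words)

# most_common(n) returns the n highest-count (item, count) pairs, highest first.
top_two = counts.most_common(2)

# len() of a Counter is the number of distinct items it has counted.
unique_count = len(counts)

print(top_two)       # [('to', 3), ('be', 2)]
print(unique_count)  # 4
```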

Another very useful Counter method is update. After you've instantiated a Counter by passing a list of objects, you can pass more objects to update and it will keep counting without discarding the existing counts:

>>> from collections import Counter
>>> c = Counter(['hello', 'hello', 1])
>>> print c
Counter({'hello': 2, 1: 1})
>>> c.update(['hello'])
>>> print c
Counter({'hello': 3, 1: 1})
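
Applied to the question, one possible sketch is to call update once per file and then take len() of the Counter. The function name count_words_in_files is hypothetical, and lowercasing the words is an assumption (the question does not say whether "The" and "the" should count as one word). Punctuation is not stripped, so "word" and "word," remain distinct. The code is written to run on both Python 2.7 and Python 3.

```python
from collections import Counter
from glob import glob


def count_words_in_files(paths):
    """Return a Counter of lowercased, whitespace-separated words across all paths."""
    counts = Counter()
    for path in paths:
        with open(path) as fh:
            # update() adds this file's words to the running totals.
            counts.update(word.lower() for word in fh.read().split())
    return counts


if __name__ == "__main__":
    pattern = "D:\\report\\shakeall\\*.txt"  # path taken from the question
    counts = count_words_in_files(glob(pattern))
    print("Total words: %d" % sum(counts.values()))
    print("Unique words: %d" % len(counts))
```

Keeping the counts rather than just a set means the same object can also answer follow-up questions, such as which words are the most frequent.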

answered Oct 07 '22 08:10

Rostyslav Dzinko

print len(set(w.lower() for w in open('filename.dat').read().split()))

This reads the entire file into memory, splits it into words on whitespace, converts each word to lower case, builds a set from the lowercase words (a set keeps only unique values), counts them, and prints the result.
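
The one-liner handles a single file. A rough sketch of the same set-based idea extended to every file matched by the question's glob pattern might look like the following. The function name unique_words is hypothetical, and reading line by line instead of calling read() is a design choice of this sketch, not part of the original one-liner. It is written to run on both Python 2.7 and Python 3.

```python
from glob import glob


def unique_words(paths):
    """Return the set of lowercased, whitespace-separated words found in any path."""
    seen = set()
    for path in paths:
        with open(path) as fh:
            # Iterating line by line avoids holding a whole file in memory.
            for line in fh:
                seen.update(word.lower() for word in line.split())
    return seen


if __name__ == "__main__":
    pattern = "D:\\report\\shakeall\\*.txt"  # path taken from the question
    print("Unique words: %d" % len(unique_words(glob(pattern))))
```

Because a single set is shared across all files, a word that appears in several files is still counted once.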

answered Oct 07 '22 07:10

NIlesh Sharma

rocksland
