Word frequency in Python not working

I am trying to count word frequencies in a text file using Python.

I am using the following code:

openfile=open("total data", "r")

linecount=0
for line in openfile:
    if line.strip():
        linecount+=1

count={}

while linecount>0:
    line=openfile.readline().split()
    for word in line:
        if word in count:
            count[word]+=1
        else:
            count[word]=1
    linecount-=1

print count

But I get an empty dictionary: "print count" gives {} as output.

I also tried using:

from collections import defaultdict
.
.
count=defaultdict(int)
.
.
     if word in count:
          count[word]=count.get(word,0)+1

But I get an empty dictionary again. I don't understand what I am doing wrong. Could someone please point it out?

nish asked Dec 21 '22 03:12


1 Answer

The loop for line in openfile: moves the file pointer to the end of the file, so the later openfile.readline() calls return empty strings and nothing is counted. To read the data again, either move the pointer back to the start of the file with openfile.seek(0) or re-open the file.
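As a minimal sketch of the seek(0) fix, assuming Python 3 and a small temporary file with made-up contents standing in for "total data": the first pass counts non-blank lines, the pointer is rewound, and the second pass counts words. The second pass iterates over the file directly rather than calling readline() linecount times, since linecount skips blank lines and could stop the loop early.

```python
import os
import tempfile

# Hypothetical sample data standing in for the "total data" file.
sample = "a b a\n\nb c\n"

with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

count = {}
with open(path, "r") as openfile:
    # First pass: count non-blank lines. This leaves the pointer at end of file.
    linecount = sum(1 for line in openfile if line.strip())

    # Rewind so the second pass starts from the beginning again.
    openfile.seek(0)

    # Second pass: count every whitespace-separated word.
    for line in openfile:
        for word in line.split():
            count[word] = count.get(word, 0) + 1

os.remove(path)
print(linecount, count)
```

Without the seek(0) call, the second loop would see no lines at all and count would stay empty, which matches the {} output in the question.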

For word frequencies, it is better to use collections.Counter:

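from collections import Counter

# The with block closes the file automatically when it ends.
with open("total data", "r") as openfile:
    c = Counter()
    for line in openfile:
        # split() breaks the line on whitespace; update() adds one count per word.
        words = line.split()
        c.update(words)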
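Once the Counter is filled, it can be queried like a dictionary. The sketch below assumes Python 3 and uses a made-up in-memory text (via io.StringIO) in place of the real file, so the sample words are purely illustrative.

```python
from collections import Counter
import io

# Hypothetical in-memory text standing in for the "total data" file.
openfile = io.StringIO("the cat the dog\nthe cat\n")

c = Counter()
for line in openfile:
    c.update(line.split())

# most_common(n) returns the n most frequent words with their counts.
top_two = c.most_common(2)
print(top_two)

# A word that never appeared has a count of 0 instead of raising KeyError.
print(c["bird"])
```

Because missing keys count as 0, no "if word in count" check is needed before incrementing.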
Ashwini Chaudhary answered Dec 31 '22 22:12