I am trying to count word frequencies in a text file using Python.
I am using the following code:
openfile=open("total data", "r")

linecount=0
for line in openfile:
    if line.strip():
        linecount+=1

count={}

while linecount>0:
    line=openfile.readline().split()
    for word in line:
        if word in count:
            count[word]+=1
        else:
            count[word]=1
    linecount-=1

print count
But I get an empty dictionary: "print count" gives {} as output.
I also tried using:
from collections import defaultdict
.
.
count=defaultdict(int)
.
.
if word in count:
    count[word]=count.get(word,0)+1
But I'm getting an empty dictionary again. I don't understand what I am doing wrong. Could someone please point it out?
The first loop (for line in openfile:) moves the file pointer to the end of the file, so every later openfile.readline() call returns an empty string and no words are ever counted.
If you want to read the data again, either move the pointer back to the start of the file with openfile.seek(0) or re-open the file.
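As a minimal sketch of the seek(0) fix, assuming Python 3 syntax: the count_words helper and the temporary sample file below are hypothetical stand-ins for the original script and the "total data" file, not part of the question's code.

```python
import os
import tempfile

def count_words(path):
    """Count non-blank lines, rewind the same handle, then count words."""
    count = {}
    with open(path, "r") as openfile:
        # First pass: count non-blank lines. This leaves the pointer at EOF.
        linecount = sum(1 for line in openfile if line.strip())
        # Rewind so the second pass starts from the beginning of the file.
        openfile.seek(0)
        # Second pass: tally each whitespace-separated word.
        for line in openfile:
            for word in line.split():
                count[word] = count.get(word, 0) + 1
    return linecount, count

# Hypothetical sample data standing in for the "total data" file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a b a\n\nb c\n")
    sample_path = tmp.name

linecount, count = count_words(sample_path)
os.remove(sample_path)
print(linecount, count)
```

Without the openfile.seek(0) call, the second loop would find nothing left to read and count would stay empty, which is the behaviour described in the question.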
A better way to get the word frequencies is collections.Counter:
from collections import Counter

with open("total data", "r") as openfile:
    c = Counter()
    for line in openfile:
        words = line.split()
        c.update(words)
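Once the Counter is filled, it can be queried like a dictionary. The sketch below assumes Python 3 and uses a hypothetical in-memory list of lines in place of the file, just to show most_common() and lookups of missing words.

```python
from collections import Counter

# Hypothetical lines standing in for the contents of the text file.
lines = ["the cat sat", "the dog sat", "the end"]

c = Counter()
for line in lines:
    # update() adds one to the tally of every word in the list.
    c.update(line.split())

# The two most frequent words, as (word, count) pairs.
top_two = c.most_common(2)
print(top_two)

# A word that never appeared has a count of 0 instead of raising KeyError.
print(c["the"], c["missing"])
```

Because a missing key simply counts as 0, no "if word in count" check is needed before incrementing.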