
Word count from a txt file program


I am counting the words in a txt file with the following code:

#!/usr/bin/python
file=open("D:\\zzzz\\names2.txt","r+")
wordcount={}
for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
print (word,wordcount)
file.close();

This gives me output like this:

>>>  goat {'goat': 2, 'cow': 1, 'Dog': 1, 'lion': 1, 'snake': 1, 'horse': 1, '': 1, 'tiger': 1, 'cat': 2, 'dog': 1} 

but I want the output in the following manner:

word  wordcount
goat    2
cow     1
dog     1.....

Also I am getting an extra symbol in the output (). How can I remove this?

user3068762 asked Jan 14 '14 06:01



2 Answers

The funny symbols you're encountering are a UTF-8 BOM (Byte Order Mark). To get rid of them, open the file using the correct encoding (I'm assuming you're on Python 3):

file = open(r"D:\zzzz\names2.txt", "r", encoding="utf-8-sig") 
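To see what the encoding choice changes, here is a minimal sketch that writes a throwaway file beginning with a UTF-8 BOM and reads it back both ways. The temporary file and its contents are made up for illustration; they are not the asker's data.

```python
import os
import tempfile

# Write a small sample file that starts with a UTF-8 BOM (bytes EF BB BF).
# The file and its contents are hypothetical, used only for this demo.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"\xef\xbb\xbfgoat cow goat")
    sample_path = tmp.name

# Plain "utf-8" keeps the BOM as the character U+FEFF attached to the first word.
with open(sample_path, "r", encoding="utf-8") as f:
    words_plain = f.read().split()

# "utf-8-sig" strips a leading BOM if one is present.
with open(sample_path, "r", encoding="utf-8-sig") as f:
    words_sig = f.read().split()

os.remove(sample_path)

print(words_plain)  # the first word still carries the BOM character
print(words_sig)    # the first word is a clean 'goat'
```

With "utf-8-sig" both occurrences of "goat" are counted as the same word, because the first one no longer carries the invisible prefix.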

Furthermore, for counting, you can use collections.Counter:

from collections import Counter
wordcount = Counter(file.read().split())

Display them with:

>>> for item in wordcount.items(): print("{}\t{}".format(*item))
...
snake   1
lion    2
goat    2
horse   3

Tim Pietzcker answered Oct 14 '22 08:10
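If you also want the "word  wordcount" header from the question, something along these lines might do. It is only a sketch: the helper name format_counts and the sample text are my own, and it uses Counter.most_common() so the most frequent words come first.

```python
from collections import Counter

def format_counts(text):
    """Return a header line plus one "word<TAB>count" line per distinct word."""
    counts = Counter(text.split())
    lines = ["word\twordcount"]
    # most_common() yields (word, count) pairs, highest count first.
    for word, count in counts.most_common():
        lines.append("{}\t{}".format(word, count))
    return lines

sample = "goat cow goat dog cat cat"  # made-up sample text
table_lines = format_counts(sample)
print("\n".join(table_lines))
```

Returning the lines instead of printing them inside the helper makes it easy to write the same table to a file later.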


#!/usr/bin/python
file = open("D:\\zzzz\\names2.txt", "r+")
wordcount = {}
# Tally each whitespace-separated word.
for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
file.close()
# Print one "word count" pair per line (print() call form, as Python 3 requires).
for k, v in wordcount.items():
    print(k, v)

bistaumanga answered Oct 14 '22 08:10
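A variant of the same dictionary approach, sketched under a few assumptions: Python 3, a with block so the file is closed automatically, the "utf-8-sig" encoding so a leading BOM does not end up in the counts, and columns padded to the longest word. The function names and the demo file are hypothetical.

```python
import os
import tempfile

def count_words(path, encoding="utf-8-sig"):
    """Count whitespace-separated words in the file at `path`."""
    wordcount = {}
    # The with block closes the file even if reading fails.
    with open(path, "r", encoding=encoding) as f:
        for word in f.read().split():
            wordcount[word] = wordcount.get(word, 0) + 1
    return wordcount

def print_table(wordcount):
    """Print a header and one left-aligned "word  count" row per word."""
    width = max([len(w) for w in wordcount] + [len("word")])
    print("{:<{w}}  {}".format("word", "wordcount", w=width))
    for word, count in wordcount.items():
        print("{:<{w}}  {}".format(word, count, w=width))

# Demo on a throwaway file with made-up contents.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.write("goat cow goat dog")
    demo_path = tmp.name

demo_counts = count_words(demo_path)
os.remove(demo_path)
print_table(demo_counts)
```

To use it on the file from the question, call count_words(r"D:\zzzz\names2.txt") and pass the result to print_table.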