Count letters in a text file

Tags:

python

I am a beginner Python programmer and I am trying to write a program that counts the number of occurrences of each letter in a text file. Here is what I've got so far:

import string 
text = open('text.txt')
letters = string.ascii_lowercase
for i in text:
  text_lower = i.lower()
  text_nospace = text_lower.replace(" ", "")
  text_nopunctuation = text_nospace.strip(string.punctuation)
  for a in letters:
    if a in text_nopunctuation:
      num = text_nopunctuation.count(a)
      print(a, num)

If the text file contains hello bob, I want the output to be:

b 2
e 1
h 1
l 2
o 2

My problem is that it doesn't work properly when the text file contains more than one line of text or has punctuation.

user2752551 asked Sep 05 '13 23:09


2 Answers

This is a very readable way to accomplish what you want using Counter:

from string import ascii_lowercase
from collections import Counter

with open('text.txt') as f:
    print(Counter(letter for line in f
                  for letter in line.lower()
                  if letter in ascii_lowercase))

You can iterate the resulting dict to print it in the format that you want.
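As a minimal sketch of that last step, the snippet below counts letters across all lines and then prints them alphabetically, one per line. The helper names count_letters and format_counts are made up for illustration, and the list sample_lines stands in for the contents of text.txt; an open file object should work in its place.

```python
from collections import Counter
from string import ascii_lowercase


def count_letters(lines):
    """Count ASCII letters (case-insensitive) across an iterable of lines."""
    return Counter(
        letter
        for line in lines
        for letter in line.lower()
        if letter in ascii_lowercase
    )


def format_counts(counts):
    """Return one 'letter count' string per letter, in alphabetical order."""
    return ["{} {}".format(letter, counts[letter]) for letter in sorted(counts)]


# Stand-in for the lines of text.txt (hypothetical sample data).
sample_lines = ["hello bob\n"]
for row in format_counts(count_letters(sample_lines)):
    print(row)
```

Because the Counter is built once from every line, the totals cover the whole file rather than being reported line by line.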

elyase answered Nov 14 '22 12:11


You can use collections.Counter:

from collections import Counter
text = 'aaaaabbbbbccccc'
c = Counter(text)
print(c)

It prints:

Counter({'a': 5, 'b': 5, 'c': 5})

Your text variable should be:

import string
with open('text.txt') as f:
    text = f.read()
# Keep only the ASCII letters, lowercased (drops spaces, newlines and punctuation).
text = ''.join(ch for ch in text.lower() if ch in string.ascii_lowercase)
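Filtering matters here because str.strip, which the question uses, only removes characters from the two ends of a string, so punctuation in the middle of a line survives. A small sketch of the difference, using a made-up sample line:

```python
import string

line = "hello, bob!"  # hypothetical sample input

# str.strip only trims characters from the two ends of the string,
# so the comma in the middle is left untouched.
stripped = line.strip(string.punctuation)

# Filtering checks every character, so punctuation and spaces are
# dropped wherever they appear.
letters_only = "".join(ch for ch in line.lower() if ch in string.ascii_lowercase)

print(stripped)
print(letters_only)
```

The first print still shows the comma and the space, while the second shows only letters.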

To get the output you need, sorted alphabetically:

for letter, repetitions in sorted(c.items()):
    print(letter, repetitions)

In my example it prints:

a 5
b 5
c 5

For more information, see the Counter documentation.

moliware answered Nov 14 '22 12:11