How to create a letter frequency counter for a text file that includes all letters (a-z) in Python 3

Tags:

python-3.x

I have a text file named textf.txt that looks something like the following:

rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g

I want to count the frequency of each letter in the text file, with the condition that a letter that does not appear in the text still gets a key:value pair with value 0. For example, if z is not in the text, the result should contain 'z': 0, and likewise for every letter from a to z. I wrote the following code:

import string  
from collections import Counter 
with open("textf.txt") as tf: 
    letter = tf.read()
letter_count = Counter(letter.translate(str.maketrans('','',string.punctuation)))
print("Frequency count of letter:","\n",letter_count)

But the output looks something like this:

Counter({' ': 110, 'r': 12, 'c': 88, 'a': 55, 'g': 57, 'w': 76, 'm': 76, 'x': 72, 'u': 70, 'q': 41, 'y': 40, 'j': 36, 'l': 32, 'b': 18, 'd': 28, 'v': 27, 'k': 22, 't': 19, 'f': 18, 'z': 16, 'i': 7})

I want the result to omit the space count ' ': 110 and to include every letter (a-z), so that a letter missing from the text is printed as, for example, 'n': 0. Any ideas or suggestions on how I could do this?


asked Oct 03 '17 15:10

adda.fuentes

2 Answers

One way to do this is to make a normal dict from your Counter, using the lowercase letters as the keys of the new dict. We use the dict.get method to supply a default value of zero for missing letters.

import string  
from collections import Counter 

letter = "rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g"

letter_count = Counter(letter.translate(str.maketrans('','',string.punctuation)))
letter_count = {k: letter_count.get(k, 0) for k in string.ascii_lowercase}
print("Frequency count of letter:\n", letter_count)

output

Frequency count of letter:
 {'a': 9, 'b': 3, 'c': 8, 'd': 4, 'e': 0, 'f': 1, 'g': 12, 'h': 0, 'i': 1, 'j': 1, 'k': 2, 'l': 2, 'm': 10, 'n': 0, 'o': 0, 'p': 0, 'q': 4, 'r': 14, 's': 0, 't': 2, 'u': 5, 'v': 4, 'w': 9, 'x': 6, 'y': 3, 'z': 2}

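If you do this in Python 3.6+ you get the side benefit that the new dict is in alphabetical order, because dicts keep insertion order and the keys are inserted in the order of string.ascii_lowercase. In CPython 3.6 that behaviour was an implementation detail that should not be relied upon; since Python 3.7 it is guaranteed by the language.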

As user2357112 mentions in the comments, we don't need to use letter_count.get(k, 0), since a Counter automatically returns zero if we try to read the value of a non-existent key. So that dict comprehension can be changed to

letter_count = {k: letter_count[k] for k in string.ascii_lowercase}
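As a minimal sketch of the same idea wrapped in a function: the name letter_frequencies is hypothetical, and lowercasing the input is an assumption on my part, since the question only shows lowercase text. Because the comprehension only asks for the 26 letters, spaces and punctuation drop out without needing translate.

```python
import string
from collections import Counter


def letter_frequencies(text):
    """Return a dict mapping every letter a-z to its count in text."""
    # Assumption: fold case so 'A' and 'a' are counted together.
    counts = Counter(text.lower())
    # Indexing a Counter with a missing key returns 0 and does not add the key,
    # so letters absent from the text simply come back as 0.
    return {letter: counts[letter] for letter in string.ascii_lowercase}


freq = letter_frequencies("Abc, a b!")
print(freq["a"], freq["b"], freq["z"])  # prints: 2 2 0
```

To use it on the file from the question, pass in the string returned by tf.read().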

answered Sep 21 '22 05:09

PM 2Ring

You can do this like so:

x = "rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g"

import string

freq = {i:0 for i in string.ascii_lowercase}
for i in x:
    if i in freq:
        freq[i] += 1

You can also replace the for-loop with a dictionary comprehension. It is less efficient for what we are trying to do, since x.count scans the whole string once per letter, but it is included here for reference:

freq = {i:x.count(i) for i in freq}

This gives the following result:

{'a': 9, 'c': 8, 'b': 3, 'e': 0, 'd': 4, 'g': 12, 'f': 1, 'i': 1, 'h': 0, 'k': 2, 'j': 1, 'm': 10, 'l': 2, 'o': 0, 'n': 0, 'q': 4, 'p': 0, 's': 0, 'r': 14, 'u': 5, 't': 2, 'w': 9, 'v': 4, 'y': 3, 'x': 6, 'z': 2}
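For anyone who wants to check that the two variants agree, here is a small sketch that builds both dicts from the same sample string and compares them. The names freq_loop and freq_count are mine, not from the answer, and dict.fromkeys is just a shorter way to write the zero-filled dict.

```python
import string

x = "rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g"

# Variant 1: a single pass over the text, skipping anything that is not a-z.
freq_loop = dict.fromkeys(string.ascii_lowercase, 0)
for ch in x:
    if ch in freq_loop:
        freq_loop[ch] += 1

# Variant 2: one full scan of the text per letter (26 scans in total).
freq_count = {ch: x.count(ch) for ch in string.ascii_lowercase}

print(freq_loop == freq_count)  # prints: True
```

For a short string the difference in speed will hardly matter; the single pass is mainly worth it for large files.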

answered Sep 19 '22 05:09

coder
