Every letter in the English alphabet has a relative frequency of occurrence. These are the frequencies:
A B C D E F G H I
.0817 .0149 .0278 .0425 .1270 .0223 .0202 .0609 .0697
J K L M N O P Q R
.0015 .0077 .0402 .0241 .0675 .0751 .0193 .0009 .0599
S T U V W X Y Z
.0633 .0906 .0276 .0098 .0236 .0015 .0197 .0007
A list called letterGoodness is predefined as:
letterGoodness = [.0817,.0149,.0278,.0425,.1270,.0223,.0202,...
I need to find the "goodness" of a string. For example, the goodness of 'I EAT' is .0697 + .1270 + .0817 + .0906 = .369. This is part of a bigger problem, but I need to solve this piece first. I started like this:
def goodness(message):
    for i in L:
        for j in i:
So it will be enough to find out how to get the occurrence percentage of any character. Can you help me? The string contains only uppercase letters and spaces.
letterGoodness is better as a dictionary. Then you can just do:
sum(letterGoodness.get(c,0) for c in yourstring.upper())  # .upper() for defensive programming
To convert letterGoodness from your list to a dictionary, you can do:
import string
letterGoodness = dict(zip(string.ascii_uppercase,letterGoodness))
If you're guaranteed to only have uppercase letters and spaces, you can do:
letterGoodness = dict(zip(string.ascii_uppercase,letterGoodness))
letterGoodness[' '] = 0
sum(letterGoodness[c] for c in yourstring)
The performance gain here is probably minimal, though, so I would favor the more robust version above.
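Putting the dictionary approach together, a minimal self-contained sketch might look like the following. It assumes the list values are the ones from the table in the question; the name goodness_by_letter is made up here so the original list is not overwritten.

```python
import string

# Frequencies for A..Z, copied from the table in the question.
letterGoodness = [
    .0817, .0149, .0278, .0425, .1270, .0223, .0202, .0609, .0697,
    .0015, .0077, .0402, .0241, .0675, .0751, .0193, .0009, .0599,
    .0633, .0906, .0276, .0098, .0236, .0015, .0197, .0007,
]

# goodness_by_letter is an illustrative name, not part of the original code.
goodness_by_letter = dict(zip(string.ascii_uppercase, letterGoodness))

def goodness(message):
    """Sum the frequency of each letter; spaces and other characters count as 0."""
    return sum(goodness_by_letter.get(c, 0) for c in message.upper())

print(goodness('I EAT'))  # approximately 0.369
```

Because .get(c, 0) supplies a default, the space needs no entry of its own in the dictionary.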
If you insist on keeping letterGoodness as a list (and I don't advise that), you can use the builtin ord to get the index (pointed out by cwallenpoole):
ordA = ord('A')
sum(letterGoodness[ord(c)-ordA] for c in yourstring if c in string.ascii_uppercase)
I'm too lazy to timeit right now, but you may want to also define a temporary set to hold string.ascii_uppercase. It might make your function run a little faster (depending on how optimized str.__contains__ is compared to set.__contains__):
ordA = ord('A')
big_letters = set(string.ascii_uppercase)
sum(letterGoodness[ord(c)-ordA] for c in yourstring.upper() if c in big_letters)
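If you do want to measure it, a rough timeit sketch could look like this. The function names, the sample string, and the repetition count are arbitrary choices for illustration, the list values are taken from the table in the question, and the timings will vary by machine and Python version.

```python
import string
import timeit

# Frequencies for A..Z, copied from the table in the question.
letterGoodness = [
    .0817, .0149, .0278, .0425, .1270, .0223, .0202, .0609, .0697,
    .0015, .0077, .0402, .0241, .0675, .0751, .0193, .0009, .0599,
    .0633, .0906, .0276, .0098, .0236, .0015, .0197, .0007,
]

ordA = ord('A')
big_letters = set(string.ascii_uppercase)

def goodness_str(message):
    # Membership test against the string (str.__contains__).
    return sum(letterGoodness[ord(c) - ordA]
               for c in message if c in string.ascii_uppercase)

def goodness_set(message):
    # Membership test against the set (set.__contains__).
    return sum(letterGoodness[ord(c) - ordA]
               for c in message if c in big_letters)

# Hypothetical sample input: uppercase letters and spaces only.
sample = 'I EAT ' * 1000

for fn in (goodness_str, goodness_set):
    elapsed = timeit.timeit(lambda: fn(sample), number=50)
    print(fn.__name__, elapsed)
```

Both functions return the same sum, so any difference in the printed times comes from the membership test alone.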