I am working on detecting rhymes in Python using the Carnegie Mellon University dictionary of pronunciation, and would like to know: How can I estimate the phonemic similarity between two words? In other words, is there an algorithm that can identify the fact that "hands" and "plans" are closer to rhyming than are "hands" and "fries"?
Some context: At first, I was willing to say that two words rhyme if their primary stressed syllable and all subsequent syllables are identical (using the c06d file, if you want to replicate in Python):
def create_cmu_sound_dict():
    # Map each word to its "final sound": the primary-stressed phoneme
    # (marked with a 1 in the CMU dictionary) plus every phoneme after it.
    final_sound_dict = {}
    with open('resources/c06d/c06d') as cmu_dict:
        for line in cmu_dict:
            parts = line.split()
            if len(parts) > 1:
                word = parts[0]
                phonemes = parts[1:]  # ARPAbet symbols, not whole syllables
                final_sound = ""
                final_sound_switch = 0
                for phoneme in phonemes:
                    if "1" in phoneme:  # primary stress: start collecting
                        final_sound_switch = 1
                        final_sound += phoneme
                    elif final_sound_switch == 1:
                        final_sound += phoneme
                final_sound_dict[word.lower()] = final_sound
    return final_sound_dict
If I then run

cmu_final_sound_dict = create_cmu_sound_dict()
print(cmu_final_sound_dict["hands"])
print(cmu_final_sound_dict["plans"])

I can see that "hands" (AE1NDZ) and "plans" (AE1NZ) sound very similar. I could work towards an estimation of this similarity on my own, but I thought I should ask: are there sophisticated algorithms that can tie a mathematical value to this degree of sonic (or auditory) similarity? That is, what algorithms or packages can one use to quantify the degree of phonemic similarity between two words? I realize this is a large question, but I would be most grateful for any advice others can offer.
Cheat.
#!/usr/bin/env python
from Levenshtein import distance

if __name__ == '__main__':
    s1 = ['HH AE1 N D Z', 'P L AE1 N Z']  # hands / plans
    s2 = ['HH AE1 N D Z', 'F R AY1 Z']    # hands / fries
    # Compare both with and without spaces, i.e. also treating each
    # pronunciation as a single string of characters.
    s1nospaces = [x.replace(' ', '') for x in s1]
    s2nospaces = [x.replace(' ', '') for x in s2]
    for seq in [s1, s2, s1nospaces, s2nospaces]:
        print(seq, distance(*seq))
Output:
['HH AE1 N D Z', 'P L AE1 N Z'] 5
['HH AE1 N D Z', 'F R AY1 Z'] 8
['HHAE1NDZ', 'PLAE1NZ'] 3
['HHAE1NDZ', 'FRAY1Z'] 5
Library: https://pypi.python.org/pypi/python-Levenshtein/0.11.2
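Since raw edit distances are not length-normalized (longer pronunciations accumulate larger distances), the same package's ratio function can be used instead to get a similarity score in [0, 1]; a quick sketch:

# Normalized similarity instead of raw distance; `ratio` ships with the
# same python-Levenshtein package linked above.
from Levenshtein import ratio

print(ratio('HHAE1NDZ', 'PLAE1NZ'))  # closer to 1.0 => more similar
print(ratio('HHAE1NDZ', 'FRAY1Z'))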
Seriously, however: since you only have text as input, and pretty much the text-based CMU dictionary to work with, you're limited to some sort of manipulation of the text input. But the way I see it, there's only a limited number of phonemes available, so you could:

1. Enumerate the phonemes in the dictionary.
2. Assign "phonemic weights" to the most important ones.
3. Profit.

There are only 74 of them in the CMU dictionary you pointed to:
% cat cmudict.06.txt | grep -v '#' | cut -f 2- -d ' ' | tr ' ' '\n' | sort | uniq | wc -l
75
(75 minus one for the empty line.)
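For reference, a rough Python equivalent of that shell pipeline (assuming the same cmudict.06.txt file and its word-then-phonemes format) looks like this:

# Count the distinct stress-marked phoneme symbols in the CMU dictionary.
# Mirrors the shell pipeline above: skip lines containing '#', take every
# field after the word itself, and collect the unique symbols.
phonemes = set()
with open('cmudict.06.txt') as cmu_dict:
    for line in cmu_dict:
        if '#' in line or not line.split():
            continue  # comment or blank line
        phonemes.update(line.split()[1:])
print(len(phonemes))  # 74 distinct symbols, matching the count above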
You'd probably get better results if you did something more advanced in step 2, e.g. assigning weights to particular phoneme combinations rather than to single phonemes. Then you could modify a Levenshtein-type distance metric, like the one in the library above, to arrive at a reasonably well-performing "phonemic distance" metric that works on text inputs, as sketched below.
Not much work left for step 3: profit.
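To make step 2 concrete, here is a minimal sketch of such a weighted, Levenshtein-type "phonemic distance" that operates on phoneme tokens rather than raw characters. The weights are illustrative guesses, not tuned values: substituting within the same broad class of sound (vowel for vowel, consonant for consonant) is assumed cheaper than crossing classes. Even this toy version ranks hands/plans closer than hands/fries:

# A minimal sketch of a weighted phonemic edit distance. All weights here
# are illustrative guesses, not tuned values.

ARPABET_VOWELS = {
    'AA', 'AE', 'AH', 'AO', 'AW', 'AY', 'EH', 'ER',
    'EY', 'IH', 'IY', 'OW', 'OY', 'UH', 'UW',
}

def strip_stress(phoneme):
    # Drop the trailing stress digit, e.g. 'AE1' -> 'AE'.
    return phoneme.rstrip('012')

def substitution_cost(p1, p2):
    # Guessed weights: identical phonemes are free, same broad class
    # (vowel/vowel or consonant/consonant) is cheap, and crossing classes
    # costs as much as an insertion or deletion.
    a, b = strip_stress(p1), strip_stress(p2)
    if a == b:
        return 0.0
    if (a in ARPABET_VOWELS) == (b in ARPABET_VOWELS):
        return 0.5
    return 1.0

def phonemic_distance(seq1, seq2):
    # Standard Levenshtein dynamic program over phoneme tokens, with the
    # weighted substitution cost above (insertions/deletions cost 1).
    m, n = len(seq1), len(seq2)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)
    for j in range(1, n + 1):
        d[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(
                d[i - 1][j] + 1.0,  # delete seq1[i-1]
                d[i][j - 1] + 1.0,  # insert seq2[j-1]
                d[i - 1][j - 1] + substitution_cost(seq1[i - 1], seq2[j - 1]),
            )
    return d[m][n]

print(phonemic_distance('HH AE1 N D Z'.split(), 'P L AE1 N Z'.split()))  # 2.5
print(phonemic_distance('HH AE1 N D Z'.split(), 'F R AY1 Z'.split()))    # 3.0

From there you could refine substitution_cost with per-pair weights (e.g. making D/T cheaper than D/SH) without touching the dynamic-programming part.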