
Python Arpabet phonetic transcription

Tags: python

Is there a library in Python that can convert words (mainly names) to Arpabet phonetic transcriptions? For example:

BARBELS -> B AA1 R B AH0 L Z

BARBEQUE -> B AA1 R B IH0 K Y UW2

BARBEQUED -> B AA1 R B IH0 K Y UW2 D

BARBEQUEING -> B AA1 R B IH0 K Y UW2 IH0 NG

BARBEQUES -> B AA1 R B IH0 K Y UW2 Z

hmghaly asked Aug 11 '12 01:08


1 Answer

You can use a tiny utility from my listener project to do this. It uses espeak under the covers to generate IPA, then uses a mapping extracted from the CMU dictionary to produce the set of ARPAbet transcriptions that could match the generated IPA. For instance:

$ listener-arpa 
we are testing
we
        W IY
are
        ER
        AA
testing
        T EH S T IH NG

The utility produces exact matches against the CMU dictionary about 45% of the time (I got around 36% using the documented correspondence in CMU/Wikipedia), while producing about 3 candidate transcriptions per word on average. That said, we see a "close match" about 99% of the time: while we might not precisely match the hand-marked-up word every time, we are generally off by only a few phonemes.
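
One way to make "off by only a few phonemes" concrete is an edit distance counted in whole phonemes rather than characters. The sketch below is an illustration only, not the measure used to produce the figures above. The helper names `strip_stress` and `phoneme_distance` are hypothetical, and the candidate string in the usage note is made up. CMU entries carry stress digits (as in the question's `AA1`), while the utility's output shown earlier does not, so the digits are stripped before comparing.

```python
def strip_stress(transcription):
    """Drop CMU stress digits: 'B AA1 R B AH0 L Z' -> 'B AA R B AH L Z'."""
    return " ".join(p.rstrip("012") for p in transcription.split())


def phoneme_distance(a, b):
    """Levenshtein distance between two transcriptions, counted in phonemes."""
    xs, ys = a.split(), b.split()
    prev = list(range(len(ys) + 1))  # distance from empty prefix of xs
    for i, x in enumerate(xs, start=1):
        cur = [i]
        for j, y in enumerate(ys, start=1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]
```

For example, comparing a made-up candidate `"B AA R B EH L Z"` with `strip_stress("B AA1 R B AH0 L Z")` gives a distance of 1, since only one vowel differs.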

To install espeak and the listener package (on Debian/Ubuntu):

$ sudo apt-get install espeak
$ pip install -e git+https://github.com/mcfletch/listener.git#egg=listener
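
As a rough illustration of the lookup idea described at the top of this answer (IPA in, a set of possible ARPAbet transcriptions out), here is a minimal sketch. It is not listener's actual code. The table `IPA_TO_ARPABET` is a tiny hand-written toy, not the mapping extracted from the CMU dictionary, and the function names `tokenize` and `candidates` are hypothetical. The IPA strings are typed by hand rather than taken from espeak.

```python
from itertools import product

# Toy table, invented for this example. A real table would be derived
# from the CMU dictionary and would cover far more symbols.
IPA_TO_ARPABET = {
    "w": ["W"],
    "iː": ["IY"],
    "ɑː": ["AA", "ER"],  # one IPA symbol may map to several phonemes
    "t": ["T"],
    "ɛ": ["EH"],
    "s": ["S"],
    "ɪ": ["IH"],
    "ŋ": ["NG"],
}


def tokenize(ipa):
    """Split an IPA string into table keys, trying longer keys first."""
    keys = sorted(IPA_TO_ARPABET, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(ipa):
        for key in keys:
            if ipa.startswith(key, i):
                tokens.append(key)
                i += len(key)
                break
        else:
            raise ValueError(f"no mapping for {ipa[i]!r}")
    return tokens


def candidates(ipa):
    """Return every ARPAbet transcription the table allows for this IPA."""
    options = [IPA_TO_ARPABET[token] for token in tokenize(ipa)]
    return [" ".join(choice) for choice in product(*options)]
```

With this toy table, `candidates("ɑː")` returns two alternatives, which mirrors how a single word can yield several candidate transcriptions in the output shown earlier.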
vrplumber answered Oct 02 '22 22:10