I am trying to build a system that accepts text and outputs the phonetic spelling of each word in it. Any ideas on which libraries I could use in Python or Java?
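Check out Soundex, a phonetic algorithm that encodes a word as a letter plus three digits so that similar-sounding names get the same code:
http://en.wikipedia.org/wiki/Soundex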
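For a feel of what Soundex gives you, here is a minimal sketch of the common American Soundex rules in plain Python. The function name `soundex` and the `CODES` table are my own naming, not a library API, and the code produces a sound-alike key rather than a readable phonetic spelling.

```python
# Digit groups used by American Soundex; vowels, H, W and Y have no code.
CODES = {}
for digit, group in {"1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
                     "4": "L", "5": "MN", "6": "R"}.items():
    for letter in group:
        CODES[letter] = digit


def soundex(word):
    """Return the 4-character Soundex key of `word` ("" if it has no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    key = [letters[0]]                 # the first letter is kept as-is
    prev = CODES.get(letters[0], "")   # but its code still suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue                   # H and W do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            key.append(code)
        prev = code                    # a vowel resets prev, allowing a repeat
    return ("".join(key) + "000")[:4]  # pad with zeros, cut to 4 characters


print(soundex("Robert"), soundex("Rupert"))
```

Both names come out as the same key, which is the point of the algorithm: it is useful for fuzzy matching of names, less so if you want something a person can read aloud.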
I came across an old Python package, Raze. It includes a phonetic module with a translation API:
>>> pd = PhoneticDictionary()
>>> pd.spell('Hello world')
... hotel-echo-lima-lima-oscar whiskey-oscar-romeo-lima-delta
It hasn't been updated in a while, but it still works.
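If you would rather not depend on an unmaintained package, that kind of output (the NATO spelling alphabet) is easy to reproduce yourself. This is a stand-in sketch, not Raze's implementation; `NATO` and `spell` are names I made up, and "xray" is written without its usual hyphen so it does not clash with the letter separator.

```python
# NATO spelling-alphabet words, in order from a to z.
NATO_WORDS = (
    "alfa bravo charlie delta echo foxtrot golf hotel india juliett kilo "
    "lima mike november oscar papa quebec romeo sierra tango uniform "
    "victor whiskey xray yankee zulu"
).split()
NATO = dict(zip("abcdefghijklmnopqrstuvwxyz", NATO_WORDS))


def spell(text):
    """Spell each word letter by letter; characters outside a-z are skipped."""
    spelled = []
    for word in text.lower().split():
        codes = [NATO[c] for c in word if c in NATO]
        if codes:  # drop words that contained no letters at all
            spelled.append("-".join(codes))
    return " ".join(spelled)


print(spell("Hello world"))
# hotel-echo-lima-lima-oscar whiskey-oscar-romeo-lima-delta
```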
Are you looking for something akin to the International Phonetic Alphabet (IPA), or some other phonetic output? If ARPAbet is OK, there is the CMU Pronouncing Dictionary (http://www.speech.cs.cmu.edu/cgi-bin/cmudict), which gives the ARPAbet rendering of most English words. I've written some code that converts ARPAbet spellings to IPA and can post it to GitHub if you'd like.
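In the meantime, here is a rough sketch of what an ARPAbet-to-IPA conversion can look like. It is not the code mentioned above; the table `ARPABET_TO_IPA` and the function `arpabet_to_ipa` are hypothetical names, the symbol choices follow one common convention, and stress marks are simply placed before the stressed vowel rather than at the syllable boundary. It takes a pronunciation string in the CMU dictionary style (phones separated by spaces, vowels carrying a stress digit 0, 1 or 2) and does not look words up for you.

```python
# One common ARPAbet -> IPA mapping for the phones used by the CMU dictionary.
ARPABET_TO_IPA = {
    "AA": "ɑ", "AE": "æ", "AH": "ʌ", "AO": "ɔ", "AW": "aʊ", "AY": "aɪ",
    "EH": "ɛ", "ER": "ɝ", "EY": "eɪ", "IH": "ɪ", "IY": "i", "OW": "oʊ",
    "OY": "ɔɪ", "UH": "ʊ", "UW": "u",
    "B": "b", "CH": "tʃ", "D": "d", "DH": "ð", "F": "f", "G": "ɡ",
    "HH": "h", "JH": "dʒ", "K": "k", "L": "l", "M": "m", "N": "n",
    "NG": "ŋ", "P": "p", "R": "ɹ", "S": "s", "SH": "ʃ", "T": "t",
    "TH": "θ", "V": "v", "W": "w", "Y": "j", "Z": "z", "ZH": "ʒ",
}
STRESS_MARKS = {"1": "ˈ", "2": "ˌ"}  # primary and secondary stress


def arpabet_to_ipa(pronunciation):
    """Convert e.g. "HH AH0 L OW1" to an IPA string (simplified stress)."""
    pieces = []
    for token in pronunciation.upper().split():
        phone = token.rstrip("012")   # strip the stress digit, if any
        stress = token[len(phone):]
        if phone == "AH" and stress == "0":
            ipa = "ə"                 # unstressed AH is usually a schwa
        elif phone == "ER" and stress == "0":
            ipa = "ɚ"                 # unstressed ER
        else:
            ipa = ARPABET_TO_IPA[phone]  # raises KeyError on unknown phones
        pieces.append(STRESS_MARKS.get(stress, "") + ipa)
    return "".join(pieces)


print(arpabet_to_ipa("HH AH0 L OW1"))
```

To go from text to IPA you would first look each word up in the dictionary to get its ARPAbet string, then pass that string through a function like this one.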