Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Machine learning challenge: learn english pronunciation

Say you want to take CMU's phonetic data set input that looks like this:

ABERRATION  AE2 B ER0 EY1 SH AH0 N
ABERRATIONAL  AE2 B ER0 EY1 SH AH0 N AH0 L
ABERRATIONS  AE2 B ER0 EY1 SH AH0 N Z
ABERT  AE1 B ER0 T
ABET  AH0 B EH1 T
ABETTED  AH0 B EH1 T IH0 D
ABETTING  AH0 B EH1 T IH0 NG
ABEX  EY1 B EH0 K S
ABEYANCE  AH0 B EY1 AH0 N S

(The word is to the left, to the right are a series of phonemes, key here)

And you want to use it as training data for a machine learning system that would take new words and guess how they would be pronounced in English.

It's not so obvious to me at least because there isn't a fixed token size of letters which could possible map to a phoneme. I have a feeling that something to do with a markov chain might be the right way to go.

How would you do this?

like image 939
ʞɔıu Avatar asked Mar 23 '09 14:03

ʞɔıu


People also ask

Is Elsa English free?

ELSA Speak Free LanguagesELSA Speak has a free version you can use without paying. The free version limits many of the features you can use. You have access to a few lessons, but don't have access to any of the tools in the Pro version, such as an analysis of your speaking score.

Is there any app to check English pronunciation?

FluentU (Android/iOS) Not only do you get to see the language used naturally, but you also get to hear native pronunciations of every word. Anywhere that you see a word, you're able to check its definition, hear an audio pronunciation and see the word used in other videos.


2 Answers

The problem is called Grapheme-to-phoneme conversion, a subproblem of Natural Language Processing. Google brings up a few papers.

like image 70
Frank Avatar answered Oct 16 '22 03:10

Frank


Not entirely my field, but maybe build a neural network with several layers - earlier layers to guess the splitting of the words into sequential syllables, the later layers to guess the pronounciation of the said syllables.

Setting up a ANFIS-learning neural network is fairly straightforward for numerical data, for literal/phonetic data the task is undoubtedly several orders more complex.

like image 42
Jukka Dahlbom Avatar answered Oct 16 '22 05:10

Jukka Dahlbom