
How do I determine if a random string sounds like English?


I have an algorithm that generates strings based on a list of input words. How do I keep only the strings that sound like English words? I.e., discard RDLO while keeping LORD.

EDIT: To clarify, they do not need to be actual words in the dictionary. They just need to sound like English. For example KEAL would be accepted.

Ozgur Ozcitak asked Sep 18 '08 12:09



1 Answer

You can build a Markov chain from a large corpus of English text.

Afterwards, you can feed candidate words into the Markov chain and check how probable each one is as English.

See here: http://en.wikipedia.org/wiki/Markov_chain

At the bottom of the page you can see the Markov text generator. What you want is exactly the reverse of it: instead of generating text from the chain, you score existing strings against it.

In a nutshell: the Markov chain stores, for each character, the probability of each character that can follow it. You can extend this idea to sequences of two or three characters if you have enough memory.
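As a rough illustration of the idea, here is a minimal Python sketch of a first-order (single-character) chain. The names `train`, `score`, `ALPHABET`, `START` and `END` are made up for this example, and the add-one smoothing and per-transition averaging are my own assumptions rather than part of the approach described above. The tiny word list only stands in for a real English corpus.

```python
import math
from collections import defaultdict

# Hypothetical constants for this sketch: lowercase letters plus word-boundary markers.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
START, END = "^", "$"


def _chars(word):
    """Lowercase the word, drop non-letters, and add boundary markers."""
    return [START] + [c for c in word.lower() if c in ALPHABET] + [END]


def train(words):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in words:
        chars = _chars(word)
        for prev, nxt in zip(chars, chars[1:]):
            counts[prev][nxt] += 1
    return counts


def score(word, counts):
    """Average log-probability per transition; higher means more English-like."""
    chars = _chars(word)
    vocab = len(ALPHABET) + 1  # 26 letters plus the end marker
    total = 0.0
    for prev, nxt in zip(chars, chars[1:]):
        row = counts.get(prev, {})
        # Add-one smoothing (an assumption of this sketch) so that unseen
        # transitions get a small nonzero probability instead of log(0).
        p = (row.get(nxt, 0) + 1) / (sum(row.values()) + vocab)
        total += math.log(p)
    return total / (len(chars) - 1)


# Toy corpus for illustration only; a real model needs a large English text.
model = train(["lord", "ford", "word", "lore", "cord"])
for candidate in ["LORD", "RDLO"]:
    print(candidate, score(candidate, model))
```

To separate the strings, you would pick a cutoff on the score by trying it on known good and bad examples. Averaging over the number of transitions keeps long strings from being penalized just for their length. Extending the chain to two or three characters of context means keying the counts on the preceding pair or triple instead of a single character.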

Nils Pipenbrinck answered Oct 02 '22 15:10
