
How can I generate a random English "sounding" word in .Net?

I know there have been several posts about random word generation based on large dictionaries or web lookups. However, I'm looking for a word generator that I can use to create strong passwords without symbols. What I need is a reliable mechanism to generate a random English-sounding word of a given length that is not a recognised dictionary word.

An example of the type of word I mean is "ratanta".

Are there any algorithms that understand compatible syllables and can therefore generate a pronounceable output string? I know that certain captcha-style controls generate these types of words, but I'm unsure whether they use an algorithm or draw from a large word set as well.

If there are any .Net implementations of this type of functionality I would be very interested to know.

Brian Scott asked Jul 30 '10 13:07



2 Answers

There are several things you can do:

1) Research English syllable structure, and generate syllables following those rules

2) Employ Markov chains to get a statistical model of English phonology.

There are plenty of resources on Markov chains, but the main idea is to record how probable each letter is after a given preceding sequence. For instance, after "q", "u" is very likely, and after "k", "q" is very unlikely (both are order-1 chains, which look at one preceding letter); after "th", "e" is very likely (an order-2 chain, which looks at two preceding letters).
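To make the idea of "probability of a letter after a certain sequence" concrete, here is a minimal sketch in Python (the logic ports directly to C#). The function name `transition_probabilities` and the tiny `SAMPLE` word list are assumptions made for illustration; a usable model would be trained on a much larger body of English text.

```python
from collections import Counter, defaultdict

def transition_probabilities(words, order=1):
    """Map each `order`-letter context to {next_letter: probability}."""
    counts = defaultdict(Counter)
    for word in words:
        word = word.lower()
        # Slide over the word: `context` is the preceding letters,
        # word[i + order] is the letter that follows them.
        for i in range(len(word) - order):
            context = word[i:i + order]
            counts[context][word[i + order]] += 1
    probs = {}
    for context, counter in counts.items():
        total = sum(counter.values())
        probs[context] = {ch: n / total for ch, n in counter.items()}
    return probs

# Tiny illustrative sample (an assumption); far too small for real use.
SAMPLE = ["queen", "quick", "quote", "the", "then", "there", "think", "bath"]

order1 = transition_probabilities(SAMPLE, order=1)
order2 = transition_probabilities(SAMPLE, order=2)
print(order1["q"])   # in this sample, every "q" is followed by "u"
print(order2["th"])  # in this sample, "e" is the most common letter after "th"
```

A higher order tends to give more English-like output but needs more training text, because each longer context is seen less often.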

If you go the syllable-model route, resources like this can help you turn your intuitions about the language into explicit rules.
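As a rough illustration of the syllable-model route, the sketch below (Python, easily ported to C#) builds each syllable as onset + vowel + optional coda. The three inventories are deliberately small assumptions, not a description of real English phonotactics, and the names `random_syllable` and `pseudo_word` are hypothetical.

```python
import secrets

# Assumed, deliberately small inventories; real English allows far more.
ONSETS = ["b", "d", "f", "g", "k", "l", "m", "n", "p", "r", "s", "t",
          "br", "st", "tr"]
VOWELS = ["a", "e", "i", "o", "u"]
CODAS = ["", "", "n", "r", "s", "t"]  # "" means an open syllable (no coda)

def random_syllable():
    """One syllable: onset + vowel + optional coda."""
    return (secrets.choice(ONSETS)
            + secrets.choice(VOWELS)
            + secrets.choice(CODAS))

def pseudo_word(syllable_count=3):
    """Concatenate `syllable_count` random syllables into one word."""
    return "".join(random_syllable() for _ in range(syllable_count))

print(pseudo_word(3))
```

`secrets.choice` is used instead of `random.choice` because the words are meant to serve as passwords. This version controls the number of syllables rather than the exact letter count; hitting an exact length would need extra logic such as regenerating until the length matches.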

UPDATE:

3) You can make it much simpler by modelling not full English but, say, Japanese or Italian, whose syllable rules are much simpler; a nonsense word built that way is as easy to remember as a nonsense English word. For instance, Japanese has only about 94 valid syllables (47 short, 47 long), so you can easily list them all and pick at random.
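A minimal sketch of the "list the syllables and pick at random" approach, in Python. The syllable table here is a simplified, romanized consonant-plus-vowel grid assumed for illustration; it is not the full inventory mentioned above, and `syllable_password` and `entropy_bits` are hypothetical names.

```python
import math
import secrets

# Simplified consonant + vowel grid (an assumption, not the full inventory).
CONSONANTS = ["", "k", "s", "t", "n", "h", "m", "r"]
VOWELS = "aiueo"
SYLLABLES = [c + v for c in CONSONANTS for v in VOWELS] + ["ya", "yu", "yo", "wa"]

def syllable_password(syllable_count=5):
    """Pick syllables uniformly at random and join them."""
    return "".join(secrets.choice(SYLLABLES) for _ in range(syllable_count))

def entropy_bits(syllable_count):
    """Bits of entropy when each syllable is chosen uniformly and independently."""
    return syllable_count * math.log2(len(SYLLABLES))

print(syllable_password(5), round(entropy_bits(5), 1))
```

Because every syllable is drawn uniformly from a fixed list, the strength of the result is easy to estimate: each syllable contributes log2 of the list size in bits, so a longer password or a larger table raises the entropy.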

Amadan answered Sep 22 '22 17:09


I'd use a Markov chain algorithm for this.

In summary:

  1. Build a dictionary. Iterate through the letters in an example piece of English text. Build a data structure that maps pairs of letters. Against each pair, record a probability that the second letter appears immediately after the first.
  2. Generate your text. Using the map that you built in (1), pick a sequence of random letters. When deciding what letter to write next, look at the letter you wrote most recently, and use that letter to determine the probability of the next letter.
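
The two steps above can be sketched as follows in Python (the same structure maps onto a `Dictionary<char, Dictionary<char, int>>` in C#). The names `build_pair_map` and `generate_word` are hypothetical, and raw counts are stored instead of probabilities, since weighted sampling only needs relative weights.

```python
import random
from collections import Counter, defaultdict

def build_pair_map(text):
    """Step 1: for each letter, count which letters follow it within a word."""
    follows = defaultdict(Counter)
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        for first, second in zip(letters, letters[1:]):
            follows[first][second] += 1
    return follows

def generate_word(follows, length, rng=random):
    """Step 2: choose each letter using the counts for the previous letter.

    Assumes `follows` is non-empty and `length` is at least 1.
    """
    word = rng.choice(sorted(follows))  # start from any letter seen as a first
    while len(word) < length:
        counter = follows.get(word[-1])
        if not counter:
            # Dead end: this letter was never seen with a successor.
            word += rng.choice(sorted(follows))
            continue
        letters = list(counter)
        weights = list(counter.values())
        word += rng.choices(letters, weights=weights, k=1)[0]
    return word

pair_map = build_pair_map("the quick brown fox jumps over the lazy dog")
print(generate_word(pair_map, 7))
```

For password use, pass `rng=random.SystemRandom()` so the choices come from the operating system's secure random source, and train on far more text than the one-sentence sample shown here.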

Tim Robinson answered Sep 21 '22 17:09