
Where can I get a list of almost all the words in the English language? [closed]

Tags: text, random

I want to generate some random text.

I tried writing a basic Java program:

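import java.util.Random;

public class RandomText {
    public static void main(String[] args) {
        Random r = new Random();
        char[] alphabet = "abcdefghijklmnopqrstuvwxyz".toCharArray();
        int wordCount = r.nextInt(2000);              // 0 to 1999 words

        for (int i = 0; i < wordCount; i++) {
            int wordLength = r.nextInt(10) + 2;       // 2 to 11 letters per word
            for (int j = 0; j < wordLength; j++) {
                System.out.print(alphabet[r.nextInt(26)]);   // one random lowercase letter
            }
            System.out.print(" ");
        }
    }
}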

and the result is something like:

tafawc flnqhabhv mqceuoqy rttzckzqa bdyxzod zbxweclvia wegmxvuoqez ijwauhmzw joxm zvphbs ogpjyip qxoymxkxv yrfoifig fbhecph izxcyfma xarzse srwic jgi fkbcdcydpz qpdvsz rqhjieqno fmelfmtgqe qozenjlxtg vfxd lkmkrksgw ytuaduknsl let ao bm lsfjednsa qouinii yrwzerdck yb kszttly zmwflwevyix kdg qpnkzuijva ssau yc wxews drqsdwbc glxb gokunixldec lznuwdvksx zkzhsirruxc sqplhv fzixywkaft fqdkumfgddn bcqp oiwwbo emhk kv qhm xkjp kacbmcd ojh wzvukx oztbexkf lylyv kdspqpa zbykj lnprtlxp af bne ryamumcg oyhldwdlq bqyfxrszuf wyrijnr ysnefsz lhhazrdwsev tll ikibsnpqwg ntzlgc aahfsdeups rushos ihqzyucd mjorscchszm tuppz hxi ssumrevg

It would be helpful if the text were at least readable.

I am thinking of taking English words and picking from them at random to make sentences. Where can I get a big list of words in the English language?
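The idea described above, picking words at random from a list, can be sketched as follows. This is a minimal sketch, not a definitive implementation: it assumes a plain-text file with one word per line, and the default path /usr/share/dict/words is only an assumption (many Unix-like systems ship such a file, but not all). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class RandomWords {

    // Joins `count` words drawn uniformly at random (with repetition) from `words`.
    static String randomSentence(List<String> words, int count, Random rng) {
        if (words.isEmpty()) {
            throw new IllegalArgumentException("word list is empty");
        }
        List<String> picked = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            picked.add(words.get(rng.nextInt(words.size())));
        }
        return String.join(" ", picked);
    }

    public static void main(String[] args) throws IOException {
        // Assumed location of a one-word-per-line list; pass another path as the first argument.
        Path wordFile = Paths.get(args.length > 0 ? args[0] : "/usr/share/dict/words");
        List<String> words = Files.readAllLines(wordFile);
        System.out.println(randomSentence(words, 8, new Random()));
    }
}
```

Passing the Random instance in as a parameter keeps the method easy to test with a fixed seed. The output consists of real words, but nothing here makes it grammatical.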

Moeb asked Oct 20 '09 11:10




2 Answers

The gold standard for natural language processing is WordNet at http://wordnet.princeton.edu/. It has an active user group, associates semantics and syntax with words, and interfaces with other NLP tools. If you are thinking of doing computation with the words, you should definitely have a look.

However, selecting words at random does not generate a useful sentence, and I suspect you will be disappointed with the results. Have a look at toolkits such as OpenNLP, which offer many tools, including part-of-speech (POS) tagging, which you will certainly need.

Even sentences with valid syntax can be meaningless, so you will need to read the work of Chomsky and others. His "Colorless green ideas sleep furiously" (http://en.wikipedia.org/wiki/Colorless_green_ideas_sleep_furiously) illustrates the problem: it is grammatical but nonsensical.
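To make the point about parts of speech concrete, here is a minimal sketch that fills a fixed template (adjective, adjective, noun, verb, adverb) from small word lists. The lists are hand-written for illustration and the class and method names are hypothetical; this does not use OpenNLP or WordNet, which would be the place to get real part-of-speech information.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

public class TemplateSentence {

    // Tiny hand-labelled lexicon grouped by part of speech (illustrative only).
    static final List<String> ADJECTIVES = Arrays.asList("colorless", "green", "quiet", "heavy");
    static final List<String> NOUNS = Arrays.asList("ideas", "rivers", "machines", "dogs");
    static final List<String> VERBS = Arrays.asList("sleep", "run", "whisper", "collapse");
    static final List<String> ADVERBS = Arrays.asList("furiously", "slowly", "loudly", "gently");

    // Returns one word chosen uniformly at random from the list.
    static String pick(List<String> words, Random rng) {
        return words.get(rng.nextInt(words.size()));
    }

    // Fills the fixed template: adjective adjective noun verb adverb.
    static String sentence(Random rng) {
        String s = String.join(" ",
                pick(ADJECTIVES, rng), pick(ADJECTIVES, rng),
                pick(NOUNS, rng), pick(VERBS, rng), pick(ADVERBS, rng));
        // Capitalise the first letter and end with a full stop.
        return Character.toUpperCase(s.charAt(0)) + s.substring(1) + ".";
    }

    public static void main(String[] args) {
        Random rng = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println(sentence(rng));
        }
    }
}
```

Every sentence this produces follows the same grammatical pattern as Chomsky's example, yet most of them will still be nonsense, which is exactly the limitation described above.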

peter.murray.rust answered Sep 20 '22 08:09


Have a look at Lorem Ipsum at http://www.lipsum.com/ for generating filler ("void") text.

There are lots of generators on the net, for example http://loremipsum.sourceforge.net/

Reference text: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed consectetur viverra fringilla. Donec at lectus at turpis bibendum placerat. Vivamus non nibh mauris. Nulla metus metus, sollicitudin nec egestas id, fermentum at nisl. Pellentesque at nisl est. In nec sem tellus, ac imperdiet lectus. Pellentesque tortor turpis, sagittis vel facilisis tristique, cursus in tortor. Mauris non neque magna, vel dignissim sem. Suspendisse interdum diam tempus dui mattis molestie. Donec in mauris urna, at vulputate ipsum. Sed sodales venenatis quam non tincidunt.

Luka Rahne answered Sep 21 '22 08:09