
Natural Language Generation in PHP

Tags:

php

nlp

I woke up last night with a thought in my head: can PHP be used to generate random words that sound natural (like the Lorem ipsum text)?

  1. Single-letter words: only the vowels 'a', 'e', 'i', 'o', 'u'.
  2. Two-letter words: any combination of a vowel and a consonant.
  3. Maximum word length: six letters, I think.

The purpose would be to fill space on website templates with this instead of 'Lorem ipsum', or send test emails for certain PHP scripts to make sure mail() works.

My idea is that PHP would generate words of random length (1 to 6 letters each), subject to a few "don't do this" rules such as "no two single-letter words next to each other", "no three vowels in a row", and "no three consonants in a row". It would then automatically add capitalization and punctuation, ending each sentence after 4 to 8 words.
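The rules described above can be sketched as a simple generate-and-reject loop. This is only an illustration, not an existing library: the function names `isAllowedWord`, `randomWord`, and `randomSentence` are made up, 'y' is assumed to count as a consonant, and every sentence is assumed to end with a period.

```php
<?php
// Hypothetical helper: checks one lowercase a-z word against the stated rules.
function isAllowedWord(string $word): bool
{
    if (strlen($word) === 1) {
        // Single-letter words must be vowels.
        return strpos('aeiou', $word) !== false;
    }
    if (strlen($word) === 2) {
        // Two-letter words: one vowel and one consonant, in either order.
        return preg_match('/^([aeiou][^aeiou]|[^aeiou][aeiou])$/', $word) === 1;
    }
    // Longer words: no three vowels and no three consonants in a row.
    return preg_match('/[aeiou]{3}|[^aeiou]{3}/', $word) === 0;
}

// Hypothetical helper: draws random letters until the word passes the rules.
function randomWord(): string
{
    do {
        $length = random_int(1, 6);
        $word = '';
        for ($i = 0; $i < $length; $i++) {
            $word .= chr(random_int(ord('a'), ord('z')));
        }
    } while (!isAllowedWord($word));

    return $word;
}

// Hypothetical helper: joins 4 to 8 words, capitalizes, and adds a period.
function randomSentence(): string
{
    $count = random_int(4, 8);
    $words = [];
    while (count($words) < $count) {
        $word = randomWord();
        $previous = end($words); // false while $words is still empty
        // Skip a single-letter word directly after another single-letter word.
        if ($previous !== false && strlen($previous) === 1 && strlen($word) === 1) {
            continue;
        }
        $words[] = $word;
    }

    return ucfirst(implode(' ', $words)) . '.';
}
```

Calling `echo randomSentence();` prints one sentence. Rejection sampling keeps the rules easy to change, at the cost of discarding many candidate words.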

Would this be at all possible, and if so, are there any pre-existing classes or functions I could implement?

ionFish asked Nov 03 '22 21:11



1 Answer

You can take the context-free grammar approach: http://en.wikipedia.org/wiki/Context-free_grammar

<word> := <vowel> | <consonant><remaining word following consonant> | <vowel><remaining word following vowel>
<vowel> := a|e|i|o|u
<consonant> := b|c|d|f|g|...
<remaining word following vowel> := <consonant><remaining word following consonant>
...and so on

Implement that grammar in any procedural language (C and PHP included), then start generating words based on the grammar.
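As a minimal sketch of that idea in PHP, each production can become a small function. The rules the grammar leaves elided are filled in here with an assumption: letters strictly alternate between vowels and consonants, and the word stops once a randomly chosen length (1 to 6) is reached. The names `pick`, `afterVowel`, `afterConsonant`, and `generateWord` are hypothetical, and 'y' is treated as a consonant.

```php
<?php
const VOWELS = ['a', 'e', 'i', 'o', 'u'];
// Assumption: every other letter, including 'y', counts as a consonant.
const CONSONANTS = ['b', 'c', 'd', 'f', 'g', 'h', 'j', 'k', 'l', 'm', 'n',
                    'p', 'q', 'r', 's', 't', 'v', 'w', 'x', 'y', 'z'];

// Returns one random element of a non-empty list.
function pick(array $items): string
{
    return $items[random_int(0, count($items) - 1)];
}

// <remaining word following vowel> := <consonant><remaining word following consonant>
// Assumed extra rule: the production may end once no letters remain.
function afterVowel(int $remaining): string
{
    if ($remaining === 0) {
        return '';
    }
    return pick(CONSONANTS) . afterConsonant($remaining - 1);
}

// Assumed rule: <remaining word following consonant> := <vowel><remaining word following vowel>
function afterConsonant(int $remaining): string
{
    if ($remaining === 0) {
        return '';
    }
    return pick(VOWELS) . afterVowel($remaining - 1);
}

// <word> := <vowel> | <consonant>... | <vowel>...
function generateWord(int $maxLength = 6): string
{
    $length = random_int(1, $maxLength);
    if ($length === 1) {
        return pick(VOWELS); // a single-letter word is always a vowel
    }
    // Choose at random whether the word starts with a vowel or a consonant.
    return random_int(0, 1) === 1
        ? pick(VOWELS) . afterVowel($length - 1)
        : pick(CONSONANTS) . afterConsonant($length - 1);
}
```

For example, `echo generateWord();` prints one word. Because the grammar itself only allows valid letter sequences, no generated word has to be thrown away; looser rules (such as allowing two consonants in a row) would need extra productions.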

I don't know of any generic PHP parsing frameworks, but you can look at best practices for writing them in the question "Best practices for writing a programming language parser".

Rey Gonzales answered Nov 09 '22 13:11
