Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generate letter grids with lots of words

Tags:

algorithm

The question is basically "how do I generate a good grid for the game 'Boggle' with lots of words" where good is defined as having lots of words of 5 or more letters.

Boggle is a game where you roll dice with letters on them, they are placed in a 4x4 grid. Example:

H S A V
E N I S
K R G I
S O L A

Words can be made by connecting letters horizontally, vertically or diagonally. In the good example grid above you can make the words "VANISHERS", "VANISHER", "KNAVISH", "ALIGNERS", "SAVINGS", "SINKERS" and around 271 other words depending on the dictionary used, for example "AS", "I", "AIR", "SIN", "IS", etc...

As a bad example this grid

O V W C
T K Z O
Y N J H
D E I E

only has ~44 words only 2 of which are > 4 letters long. "TYNED" and "HINKY".

There's lots of similar questions but AFAICT not this exact question. This is obviously a reference to the game "Scramble with Friends".

The first solution, picking letters at random, has the problem that if you accidently pick all consonants there will be no words. Adding a few random vowels is not enough to guarantee a good set of words. You might only get 1 to 4 letter words whereas a good algorithm will choose a set of letters that has > 200 words with many words > 7 letters.

I'm open to any algorithm. Obviously I could write code to brute force solutions finding every possible grid and then sorting them by grids with the most words but that simple solution would take forever to run.

I can imagine various heuristics like choosing a long word (8-16 letters), putting those letters in the grid at random but in a way that can actually still make the word and then filling in the left over spaces. I suspect that's also not enough to guarantee a good set of words though I haven't tried it yet.

It's possible the solution requires pre-processing a dictionary to know common parts of words. For example all words that end in "ing" or "ers" or "ght" or "tion" or "land". Or somehow organizing them into a graph of shared letters. Maybe weighting certain sets of letters so "ing" or "ers" are inserted often.

Ideas?

like image 839
gman Avatar asked Aug 11 '13 23:08

gman


1 Answers

Short of the brute-force search proposal there is probably no way to guarantee that you have a good grid. If you use the letter frequency as found on the Boggle dice, then you will get 'average' grids (exactly as if you roll the dice). You could improve this by adding extra heuristics or filters, for example:

 ensure that (almost) every consonant is 'in-reach-of' a vowel
 ensure 'Q' is 'in-reach-of' a 'U'
 ensure the ratio of vowels to consonants is within a set range
 ensure the number of rare consonants is not too large
 etc

Then you could

 set letters using weighted letter frequency
 change (swap/replace) letters not meeting your heuristics

It would still be possible for a bad grid to get through unless you checked via brute-force, but you may be able to reduce the number of bad grids substantially from those returned by a simple randomly generated grid.

Alternately, generate random grids and do the brute force work as required to pick good grids. But do this in the background (days or weeks before needed). Then store a bunch of good grids and choose one randomly as required when needed (and cross it off your list so you don't see it again).

like image 137
Penguino Avatar answered Sep 27 '22 23:09

Penguino