I'm working on a crossword-like problem, but I don't know how to design the algorithm.
For example:
So the algorithm's task is to make correct words which are stored in dictionary.
app -> lapp -> leapp -> lecapp -> .... -> lappe -> eappc -> ... -> appl -> apple (correct answer)
What is the best solution for this algorithm?
ALGO is not a valid scrabble word.
No. To make it clear right now, IQ is not a valid word in Scrabble. This is according to the official Scrabble dictionary, even though IQ is a legitimate word in the dictionary. That's because, generally speaking, you cannot use abbreviations in Scrabble.
The player's letter rack is visible at the bottom of the screen. The player types a word composed of letters from the rack, and if the word is acceptable by the game, the player moves the cursor to the game board to position the word onscreen and score the move.
You might be interested in Googling the research paper "The World's Fastest Scrabble Program" by Appel and Jacobson (1988). The algorithms are outlined in pseudocode, so it takes a bit of work to mold that into useable form and glue it all together; however, the program the authors outline works great.
Store your dictionary as a tree, for example:
*
|
+----+----+
| |
A* B
| |
+--+--+ E*
| | |
P S +-+-+
| | | |
P* K* A E*
| |
+-+-+ +-+-+
| | | |
E L D* N*
| |
A E*
|
L*
Thanks paxdiablo for making my tree more readable.
This tree has the words a, app, appeal, apple, ask, bead, bean, be, and bee. The nodes marked with an asterisk indicate "If I were to stop here, this would be a valid word" such as the 'e' below 'b' for 'be'.
When you find a letter you don't know, use a wildcard (i.e., pick all children and recurse down all paths).
You said crossword, but then your "letters...for making words" seemed to indicate Scrabble. This would work for either. Not the fastest, but plenty fast.
Thanks Andreas for reminding us that this is called a trie.
If you wanted to say "the second letter is a P" you would start from the root node and take every branch (which would be every letter in the alphabet, assuming it's a proper dictionary) and then the "P" branch and then go on from there.
I've actually written a crossword program before (cryptic but the theory behind the construction is identical).
I had a database of words and their clues which could be sorted by times used (so that I wouldn't tend to get duplicate crosswords on subsequent runs).
The first thing you should do is design your patterns (blacks where you can't put letters and whites where you can). Trying to fit words into a grid while creating the pattern on the fly is very time consuming and prone to errors. If you look at most crosswords, they tend to follow certain rules to make it easier. Things like being symmetrical around one of the diagonals and disallowing a square of four white cells (to ease the task of selecting suitable words).
Once you have the pattern, then you start finding words to place in it. That way, you would know that "app" was the start of the word and be able to limit your searches to those starting with "app", not every word that has "app" in it. Similarly for words where you have known letters at any positions. It's a lot easier to locate words with letters at known positions than to evaluate those letters at any starting position within a word.
Mine ended up being written in shell script (believe it or not) and using the dictionary that came from Linux as a word search tool. If you know you have a 5-letter word starting with "app", it's quite easy to use:
grep '^app..$' words.txt
to get a list of all valid possibilities.
And, as each word was found, it was copied to a clues.txt file that contained both the word and multiple possible clues. The actual format was use {count, word, clue} where the same word may exist on multiple lines with a different clue - this allowed piping of the grep
through sort
so that lesser used words/clues floated to the top (whenever a word/clue was used, its count was incremented, making it less likely to be used next time).
Once that file was a decent size, the program would use it first to locate words and, only if one wasn't found would it revert to the words file (sans clues) where manual intervention would be required.
It actually ended up being quite good at doing the job. It wasn't blindingly fast but I didn't need to generate one every three seconds - this was for a community newsletter sent out once a week.
Now that you've changed the question to a Scrabble variant, that's actually much harder.
You need to take into account the letters you have, the letters on the board and the fact that there's a lot more places that you have to evaluate. This makes a brute force method much harder.
What I would do as an initial cut would be to select possibilities (starting position and direction on the board) chosen randomly, then use the same algorithm as for the crossword variant above to locate all words that can fit there. Then, if you have the letters to satisfy that word, store it (along with its score) in a list.
Keep in mind that you need to watch out for interfering with other words on the board.
I would continue to examine possibilities until one of:
That last one's important - if you're playing a beginner, you don't want to exhaustively examine millions of possibilities.
Then, choose the best move from your list (or maybe not the best if playing at beginner level - it all depends on how good you want the computer to be).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With