Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

scrabble solving with maximum score

I was asked a question

You are given a list of characters, a score associated with each character and a dictionary of valid words ( say normal English dictionary ). you have to form a word out of the character list such that the score is maximum and the word is valid.

I could think of a solution involving a trie made out of dictionary and backtracking with available characters, but could not formulate properly. Does anyone know the correct approach or come up with one?

like image 474
dharakk Avatar asked Apr 28 '15 11:04

dharakk


2 Answers

First iterate over your letters and count how many times do you have each of the characters in the English alphabet. Store this in a static, say a char array of size 26 where first cell corresponds to a second to b and so on. Name this original array cnt. Now iterate over all words and for each word form a similar array of size 26. For each of the cells in this array check if you have at least as many occurrences in cnt. If that is the case, you can write the word otherwise you can't. If you can write the word you compute its score and maximize the score in a helper variable.

This approach will have linear complexity and this is also the best asymptotic complexity you can possibly have(after all the input you're given is of linear size).

like image 121
Ivaylo Strandjev Avatar answered Nov 03 '22 20:11

Ivaylo Strandjev


Inspired by Programmer Person's answer (initially I thought that approach was O(n!) so I discarded it). It needs O(nr of words) setup and then O(2^(chars in query)) for each question. This is exponential, but in Scrabble you only have 7 letter tiles at a time; so you need to check only 128 possibilities!

First observation is that the order of characters in query or word doesn't matter, so you want to process your list into a set of bag of chars. A way to do that is to 'sort' the word so "bac", "cab" become "abc".

Now you take your query, and iterate all possible answers. All variants of keep/discard for each letter. It's easier to see in binary form: 1111 to keep all, 1110 to discard the last letter...

Then check if each possibility exists in your dictionary (hash map for simplicity), then return the one with the maximum score.

import nltk
from string import ascii_lowercase
from itertools import product

scores = {c:s for s, c in enumerate(ascii_lowercase)}
sanitize = lambda w: "".join(c for c in w.lower() if c in scores)
anagram = lambda w: "".join(sorted(w))

anagrams = {anagram(sanitize(w)):w for w in nltk.corpus.words.words()}

while True:
    query = input("What do you have?")
    if not query: break

    # make it look like our preprocessed word list
    query = anagram(sanitize(query))

    results = {}

    # all variants for our query
    for mask in product((True, False), repeat=len(query)):
        # get the variant given the mask
        masked = "".join(c for i, c in enumerate(query) if mask[i])
        # check if it's valid
        if masked in anagrams:
            # score it, also getting the word back would be nice
            results[sum(scores[c] for c in masked)] = anagrams[masked]

    print(*max(results.items()))
like image 31
csiz Avatar answered Nov 03 '22 21:11

csiz