In the game Hangman, is it the case that a greedy letter-frequency algorithm is equivalent to a best-chance-of-winning algorithm?
Is there ever a case where it's worth sacrificing preservation of your remaining lives, for the sake of a better chance of guessing the correct answer?
Further clarification of the problem:
Motivation: This question is inspired by the interesting discussion at http://www.datagenetics.com/blog/april12012/index.html
They suggest an algorithm for solving the word game "Hangman" optimally.
Their strategy can be summarised thus (edited for clarification):
At each stage, we are guessing the letter (not previously guessed) which occurs in the largest number of remaining possible words.
There is some motivation to like this algorithm - we are always minimally likely to lose a life.
But, it strikes me that this isn't necessarily the best solution: if we're trying to guess the word (within a certain number of lives), it's not necessarily always the case that the most frequently occurring letter is the most useful letter to distinguish between the remaining available words.
I'm not sure, though, as it does seem opportune to avoid losing a life wherever possible. Will it ever be the case that optimal strategy permits us to sacrifice a life for a better opportunity to win?
Question: is it the case that this greedy algorithm is equivalent to a best-chance-of-winning algorithm, or not? And can you prove it?
An example dictionary+game would be ideal to show a disproof.
We select the most frequent letter from the selected subset of words and guess about the given letter. For every guesses the guessed letter will be marked as guessed and they won't be guessed again in the future. This way you have much better chances of survival than in the greedy algorithm described in your question.
The hardest word to guess in hangman, according to science, is: Jazz. Composed of 75 percent uncommon letters (J and Z) and allowing only three chances at picking correctly, jazz is the perfect storm of Hangman trickery. It's kind of fitting, really. They say jazz is all about the notes they don't play.
The sequence above represents the usage order of letters in the English language, with the letter 'E' being the most common letter, followed by the letter 'T', all the way down to the letter 'Z', the least commonly used. So, the first letter we should guess when trying to solve a hangman is the letter 'E', right?
Assume the following dictionary: ABC ABD AEF EGH
. (I'll capitalize unguessed letters.)
Assume you have only 1 life (makes the proof so much easier...).
The algorithm proposed above:
Starting with A
, you lose (1/4) or are left with aBC aBD aEF
(3/4).
Now guess B
and lose (1/3) or are left with abC abD
(2/3).
Now guess C
or D
and you lose (1/2) or win (1/2).
Probability to win: 3/4 * 2/3 * 1/2 = 1/4.
Now try something else:
Starting with E
, you lose (1/2) or are left with AeF eGH
(1/2).
Now you know what to guess and win.
Probability to win: 1/2.
Clearly the proposed algorithm is not optimal.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With