Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the best way to unscramble words with PHP?

Tags:

algorithm

php

I have a word list and I want to unscramble words using this word list, in PHP.

It seems to me that PHP doesn't have a built-in function that does this. So could someone please suggest a good algorithm to do this, or at least point me in the right direction?

EDIT: edited to add example

So basically, what I'm talking about is I have a list of words:

   apple
   banana
   orange

Then, I'm given a bunch of jumbled letters.

   pplea
   nanaba
   eroang
like image 899
rgin Avatar asked Dec 28 '22 01:12

rgin


2 Answers

Given a dictionary of known words:

foreach ($list as $word)
{
  if (count_chars($scrambled_word,1) == count_chars($word,1))
    echo "$word\n";
}

Edit: A simple optimization would be to move the count_chars($scrambled_word,1)) outside the loop since it never changes:

$letters = count_chars($scrambled_word,1)
foreach ($list as $word)
{
  if ($letters == count_chars($word,1))
    echo "$word\n";
}
like image 148
Matthew Avatar answered Jan 09 '23 22:01

Matthew


Warning: I rarely use PHP, so this is dealing only with a general algorithm that should work in almost any language, not anything specific to PHP.

Presumably you have a word in which the letters have been rearranged, and you want to find what word(s) could be made from those letters.

If that's correct, the general idea is fairly simple: take a copy of your word list, and sort the letters in each word into alphabetical order. Put the sorted and unsorted versions of each word side by side, and sort the whole thing by the sorted words (but keeping each unsorted word along with its sorted version). You may want to collapse duplicates together, so that (for example) instead of {abt : bat} and {abt : tab}, you have: {abt: bat, tab}

Then, to match up a scrambled word, sort its letters in alphabetical order. Look for matches in your dictionary (since it's sorted, you can use a binary search). When you find a match, the result is the word (or words) associated with that sorted letter group. Using the example above, if the scrambled word was "tba", you'd sort it to get "abt", then look up "abt" to get "bat" and "tab".

Edit: As @Moron pointed out in the comments, sorting and binary search aren't really crucial points in themselves. The basic points are to turn all equivalent inputs into identical keys, then use some sort of fast lookup by key to find the word(s) for that key.

Sorting the letters in each word is one easy way to turn equivalent inputs into identical keys. Sorting the list and doing a binary search is one easy way to do fast lookups by key.

In both cases, there are quite a few alternatives. I'm not at all sure the alternatives are likely to improve performance a lot, but they certainly could.

Just for example, instead of a pure binary search you could have a second level of index that told you where the keys starting with 'a' were, the keys starting with 'b', and so on. Given that a couple of extremely frequently-used letters are near the beginning of the alphabet ('e' and 'a', for example) you might be better off sorting the words so that relatively uncommon letters ('q', 'z', etc.) are toward the front of the key, and the most commonly used letters are at the end. This would give that first lookup based on the initial character the greatest discrimination.

On the sort/binary search side, there are probably more alternatives, and probably better arguments to be made in favor of using something else. Hash tables typically allow lookups in (nearly) constant time. Tries can reduce storage substantially, especially when many words share a common prefix. The only obvious disadvantage is that the code for either one is probably more work (though PHP's array type is hash-based, so you could probably use it quite nicely).

like image 22
Jerry Coffin Avatar answered Jan 09 '23 22:01

Jerry Coffin