In a word game similar to Ruzzle or Letterpress, where users have to construct words out of a given set of letters:
I keep my dictionary in a simple SQL table:
create table good_words (
word varchar(16) primary key
);
Since the game duration is very short I do not want to check every entered word by calling a PHP script, which would look that word up in the good_words
table.
Instead I'd like to download all possible words by one PHP script call before the round starts - since all letters are known.
My question is: if there is a nice SQLish way to find such words?
I.e. I could run a longer-taking script once to add a column to good_words
table, which would have same letters as in the word
columnt, but sorted alphabetically... But I still can't think of a way to match for it given a set of letters.
And doing the word matching inside of a PHP script (vs. inside the database) would probably take too long (because of bandwidth: would have to fetch every row from the database to the PHP script).
Any suggestions or insights please?
Using postgresql-8.4.13 with CentOS Linux 6.3.
UPDATE:
Other ideas I have:
a
, b
, ... , z
and whenever I store a word
into good_words
, store the letter occurences there. I wonder if it is possible to create an insert trigger in Pl/PgSQL for that?Nice question, I upvoted.
What you're up to is a list of all possible permutations of the given letters of a given length. As described in the PostgreSQL wiki, you can create a function and call it like this (matches highlighted letters in your screenshot):
SELECT * FROM permute('{E,R,O,M}'::text[]);
Now, to query the good_words
use something like:
SELECT gw.word, gw.stamp
FROM good_words gw
JOIN permute('{E,R,O,M}'::text[]) s(w) ON gw.word=array_to_string(s.w, '');
This could be a start, except that it doesn't check if we have enough letters, only if he have the right letters.
SELECT word from
(select word,generate_series(0,length(word)) as s from good_words) as q
WHERE substring(word,s,1) IN ('t','h','e','l','e','t','t','e','r','s')
GROUP BY word
HAVING count(*)>=length(word);
http://sqlfiddle.com/#!1/2e3a2/3
EDIT:
This query select only the valid words though it seems a bit redundant. It's not perfect but certainly proves it can be done.
WITH words AS
(SELECT word, substring(word,s,1) as sub from
(select word,generate_series(1,length(word)) as s from good_words) as q
WHERE substring(word,s,1) IN ('t','e','s','e','r','e','r','o','r','e','m','a','s','d','s','s'))
SELECT w.word FROM
(
SELECT word,words.sub,count(DISTINCT s) as cnt FROM
(SELECT s, substring(array_to_string(l, ''),s,1) as sub FROM
(SELECT l, generate_subscripts(l,1) as s FROM
(SELECT ARRAY['t','e','s','e','r','e','r','o','r','e','m','a','s','d','s','s'] as l)
as q)
as q) as let JOIN
words ON let.sub=words.sub
GROUP BY words.word,words.sub) as let
JOIN
(select word,sub,count(*) as cnt from words
GROUP BY word, sub)
as w ON let.word=w.word AND let.sub=w.sub AND let.cnt>=w.cnt
GROUP BY w.word
HAVING sum(w.cnt)=length(w.word);
Fiddle with all possible 3+ letters words (485) for that image: http://sqlfiddle.com/#!1/2fc66/1 Fiddle with 699 words out of which 485 are correct: http://sqlfiddle.com/#!1/4f42e/1
Edit 2: We can use array operators like so to get a list of words that contain the letters we want:
SELECT word as sub from
(select word,generate_series(1,length(word)) as s from good_words) as q
GROUP BY word
HAVING array_agg(substring(word,s,1)) <@ ARRAY['t','e','s','e','r','e','r','o','r','e','m','a','s','d','s','s'];
So we can use it to narrow down the list of words we need to check.
WITH words AS
(SELECT word, substring(word,s,1) as sub from
(select word,generate_series(1,length(word)) as s from
(
SELECT word from
(select word,generate_series(1,length(word)) as s from good_words) as q
GROUP BY word
HAVING array_agg(substring(word,s,1)) <@ ARRAY['t','e','s','e','r','e','r','o','r','e','m','a','s','d','s','s']
)as q) as q)
SELECT DISTINCT w.word FROM
(
SELECT word,words.sub,count(DISTINCT s) as cnt FROM
(SELECT s, substring(array_to_string(l, ''),s,1) as sub FROM
(SELECT l, generate_subscripts(l,1) as s FROM
(SELECT ARRAY['t','e','s','e','r','e','r','o','r','e','m','a','s','d','s','s'] as l)
as q)
as q) as let JOIN
words ON let.sub=words.sub
GROUP BY words.word,words.sub) as let
JOIN
(select word,sub,count(*) as cnt from words
GROUP BY word, sub)
as w ON let.word=w.word AND let.sub=w.sub AND let.cnt>=w.cnt
GROUP BY w.word
HAVING sum(w.cnt)=length(w.word) ORDER BY w.word;
http://sqlfiddle.com/#!1/4f42e/44
We can use GIN indexes to work on arrays so we probably could create a table that would store the arrays of letters and make words point to it (act, cat and tact would all point to array [a,c,t]) so probably that would speed things up but that's up for testing.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With