The query
SELECT to_tsvector('recreation') @@ to_tsquery('recreatio:*');
returns false even though 'recreati' is a prefix of 'recreation'. This seems to happen because 'recreation' is stored as its stem, 'recreat'. For example, if we deliberately break the stemming algorithm by running
SELECT to_tsvector('recreation1') @@ to_tsquery('recreatio:*');
the query returns true.
Is there a way to make the first query match?
Not sure if this answer is useful given the age of the question, but:
It seems you are right:
select ts_lexize('english_stem','recreation');
outputs
ts_lexize
-----------
{recreat}
(1 row)
and the documentation says
Also, * can be attached to a lexeme to specify prefix matching:
SELECT to_tsquery('supern:*A & star:A*B');
Such a lexeme will match any word in a tsvector that begins with the given string.
So it seems there is no way to make original query match.
One could fallback to looking for partial matches of the stems and the query, e.g. using pg_trgm
extension:
SELECT (to_tsvector('recreation creation') @@ to_tsquery('recreatio:*')) or
'recreatio:*' % any (
select trim(both '''' from regexp_split_to_table(strip(to_tsvector('recreation creation'))::text, ' '))
);
(Maybe the array of stems can be formed in a more elegant way.)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With