I want to implement full-text search in PostgreSQL combined with a kind of fuzzy search. For my test setup I followed this article: https://blog.lateral.io/2015/05/full-text-search-in-milliseconds-with-postgresql/ and everything works fine. But sometimes I have search cases with and without spaces in the search string, like this:
My 'title' column contains an entry like 'test123'. My search string looks like 'test 123', with a space in it. How can I get a hit in this test case?
My search SQL query looks like this:
SELECT *
FROM test, plainto_tsquery('test:*&123:*') as q
WHERE (tsv @@ q)
result: 0 rows
So I tried to figure out whether I can use pg_trgm combined with tsvector, but I cannot find a solution. Do you have an idea?
When performing a full-text search in a PostgreSQL database, you use the to_tsvector() and to_tsquery() functions together with the match operator @@. The match operator returns a boolean value (either t or f) indicating whether a keyword search (tsquery) matches a tsvector document.
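For example, a minimal sketch you can run in psql (the sample text and the 'english' configuration are illustrative):

```sql
-- A matching lexeme makes @@ return t:
SELECT to_tsvector('english', 'the quick brown fox') @@ to_tsquery('english', 'fox');  -- t
-- No matching lexeme makes it return f:
SELECT to_tsvector('english', 'the quick brown fox') @@ to_tsquery('english', 'cat');  -- f
```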
A fuzzy search returns results that only partially or approximately match the search term, rather than requiring an exact hit. PostgreSQL offers different approaches to fuzzy search, through extensions and the algorithms they provide.
Yes, you can keep full-text search in Postgres. You can go even deeper and make your Postgres full-text search even more robust by implementing features such as highlighting results, or writing your own custom dictionaries or functions.
As the documentation on parsing states:
...plainto_tsquery will not recognize tsquery operators, weight labels, or prefix-match labels in its input...
plainto_tsquery and phraseto_tsquery are convenience functions that make it easier to search by a full string, but they don't support all of the features. Use to_tsquery instead, which accepts the full search syntax:
SELECT *
FROM test, to_tsquery('testing:* & 123:*') as q
WHERE (tsv @@ q)
This function also requires you to normalize the search query in the same way you normalize the text you're searching with to_tsvector, but that's pretty easy with some string functions:
SELECT string_agg(lexeme || ':*', ' & ' order by positions)
FROM unnest(to_tsvector('testing 123'))
This basically gets the individual tokens from to_tsvector, appends :* to each, then joins them with & to create a single string. The example above takes testing 123 and produces testing:* & 123:* (with the default english configuration the stemmer would actually reduce testing to test, yielding test:* & 123:*), which you can then use directly with to_tsquery to get fuzzy prefix matching with the normalization intact.
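To see what the string functions operate on, you can inspect the output of unnest() on a tsvector directly (a sketch; the 'simple' configuration is used here because it keeps tokens as-is, without stemming):

```sql
-- Each row of unnest() is (lexeme, positions, weights):
SELECT lexeme, positions
FROM unnest(to_tsvector('simple', 'testing 123'));

-- Rebuild the prefix query string from those rows:
SELECT string_agg(lexeme || ':*', ' & ' ORDER BY positions)
FROM unnest(to_tsvector('simple', 'testing 123'));
-- result: testing:* & 123:*
```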
You can combine it all together into a CTE to make it simple:
WITH search AS (
SELECT to_tsquery(string_agg(lexeme || ':*', ' & ' order by positions)) AS query
FROM unnest(to_tsvector('enter your search query here'))
)
SELECT test.*
FROM test, search
WHERE (test.tsv @@ search.query)
This assumes that the table has a tsv column of datatype tsvector which is pregenerated, instead of creating it on every query (which is much slower). PG12+ supports generated columns, which can keep this column updated automatically.
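As a sketch of that setup (the table and tsv column names echo this thread; the 'english' configuration and the title source column are assumptions), a PG12+ generated column plus a GIN index could look like:

```sql
-- Keep tsv in sync with title automatically (PG12+):
ALTER TABLE test
  ADD COLUMN tsv tsvector
  GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, ''))) STORED;

-- A GIN index makes the @@ match fast on larger tables:
CREATE INDEX test_tsv_idx ON test USING GIN (tsv);
```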