Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cannot get right results from postgre full-text search

I'm developing a simple articles website in brazilian portuguese language. The search feature is based on a full-text search, but it isn't returning expected results.

I made this on postgresql. Here is the simplified table:

Artigos
-id
-title -- article title
-intro -- article introduction
-content -- article body
-publishdate -- date of launch
-artigosts -- this will work as our fts index.

After creating the table, I ran:

UPDATE artigos SET artigosts = 
setweight(to_tsvector('pg_catalog.portuguese', coalesce(title,'')), 'A') || 
setweight(to_tsvector('pg_catalog.portuguese', coalesce(intro,'')), 'B') ||
setweight(to_tsvector('pg_catalog.portuguese', coalesce(content,'')), 'C');

CREATE INDEX artigosts_idx ON artigos USING gist (artigosts);

CREATE TRIGGER artigosts_tg 
BEFORE INSERT OR UPDATE ON artigos 
FOR EACH ROW EXECUTE PROCEDURE 
  tsvector_update_trigger('artigosts', 'pg_catalog.portuguese', 'title', 'intro', 'content');

Yes, I intend to use simple weightning on the searches. Made a index to speed up, a trigger, so I can insert and update without worry about remaking index and so on.

Well, per my understanding, it is everything fine. But results aren't. A simple example.

Let's say I have "... banco de dados ... no banco ... " as one article content. When I do:

SELECT title, intro, content FROM artigos WHERE plainto_tsquery('banco de dados') @@ artigosts;

It returns an empty set. I checked the ts_vector column and saw the predicates 'banc' and 'dad'. But I still can't understand why it does not return the row containing the mentioned article.

Can someone bring a light into this question?

like image 898
Davis Peixoto Avatar asked Oct 14 '10 17:10

Davis Peixoto


People also ask

Is PostgreSQL good for full text search?

Yes, You Can Keep Full-Text Search in Postgres You can get even deeper and make your Postgres full-text search even more robust, by implementing features such as highlighting results, or writing your own custom dictionaries or functions.

What is FTS in PostgreSQL?

Introduction. Full-text search (FTS) is a technique used by search engines to find results in a database. It can be used to power search results on websites like shops, search engines, newspapers, and more.

What is lexemes in PostgreSQL?

A lexeme is a string, just like a token, but it has been normalized so that different forms of the same word are made alike. For example, normalization almost always includes folding upper-case letters to lower-case, and often involves removal of suffixes (such as s or es in English).


1 Answers

The reason for this is probably because your default dictionary is set to english. Try the following queries to determine if that is actually the case.

SELECT * FROM ts_debug('banco de dados');

This query will show how the dictionary parses up your search phrase. It should provide the lexemes "banco", "de", and "dado". So what you're actually searching for will not exist in the index and you will receive 0 results.

Now try this:

SELECT * FROM ts_debug('portuguese', 'banco de dados');

It should return the lexemes that exist in the index, "banc" and "dad". If that is the case then you can simply change your search query to get the appropriate result.

SELECT title, intro, content FROM artigos WHERE plainto_tsquery('portuguese', 'banco de dados') @@ artigosts;
like image 192
thetaiko Avatar answered Sep 30 '22 06:09

thetaiko