I have a postgresql view that is comprised as a combination of 3 tables:
create view search_view as
select u.first_name, u.last_name, a.notes, a.summary, a.search_index
from user as u, assessor as a, connector as c
where a.connector_id = c.id and c.user_id = u.id;
However, I need to concat tsvector fields from 2 of the 3 table into a single tsvector field in the view which provides full text search across 4 fields: 2 from one table, and 2 from another.
I've read the documentation stating that I can use the concat operator to combine two tsvector fields, but I'm not certain what this looks like syntactically, and also whether there are potential gotchas with this implementation.
I'm looking for example code that concats two tsvector fields from separate tables into a view, and also commentary on whether this is a good or bad practice in postgresql land.
I was wondering the same thing. I don't think we are supposed to be combining tsvectors from multiple tables like this. Best solution is to:
This will be my approach at least. I to have been doing lots of reading and have found that people aren't combining data from multiple tables into tsvectors. In fact I don't think this is possible, it may only be possible to use the columns of the current table when creating a tsvector.
Concatenating tsv vectors works but as per comments, index is probably not used this way (not an expert, can't say if it does or does not).
SELECT * FROM newsletters
LEFT JOIN campaigns ON newsletters.campaign_id=campaigns.id
WHERE newsletters.tsv || campaigns.tsv @@ to_tsquery(unaccent(?))
The reason why you'd want this is to search for an AND string like txt1 & txt2 & txt 3
which is very common usage scenario. If you simpy split the search by an OR WHERE campaigns.tsv @@ to_tsquery(unaccent(?)
this won't work because it will try to match all 3 tokens in both tsv column but the tokens could be in either column.
One solution which I found is to use triggers to insert and update the tsv column in table1 whenever the table2 changes, see: https://dba.stackexchange.com/questions/154011/postgresql-full-text-search-tsv-column-trigger-with-many-to-many but this is not a definitive answer and using that many triggers is error prone and hacky.
Official documentation and some tutorials also show concatenating all the wanted colums into a ts vector on the fly without using a tsv column. But it is unclear how much slower is the on-the-fly versus tsv column approach, I can't find a single benchmark or explanation about this. The documenntation simply states:
Another advantage is that searches will be faster, since it will not be necessary to redo the to_tsvector calls to verify index matches. (This is more important when using a GiST index than a GIN index; see Section 12.9.) The expression-index approach is simpler to set up, however, and it requires less disk space since the tsvector representation is not stored explicitly.
All I can tell from this is that tsv columns are probably waste of resources and just complicate things but it'd be nice to see some hard numbers. But if you can concat tsv columns like this, then I guess it's no different than doing it in a WHERE clause.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With