Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Optimize Full-Text Search Across Multiple Tables

I have the requirement to search several different tables in my SQL Server database. And I need to sort the results based on in which table the match occurred.

The approach I've taken is shown below. However, this doesn't seem very efficient as the amount of data grows.

Can anyone suggests any tricks to optimize this?

-- Full-text query
DECLARE @FtsQuery nvarchar(100)
SET @FtsQuery = 'FORMSOF(INFLECTIONAL, detail)'

-- Maximum characters in description column
DECLARE @MaxDescription int
SET @MaxDescription = 250

SELECT 1 AS RankGroup, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) AS Description FROM Table1
    INNER JOIN CONTAINSTABLE(Table1, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table1.Id
UNION SELECT 2, FTS.Rank, Id, Title, NULL FROM Table2
    INNER JOIN CONTAINSTABLE(Table2, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table2.Id
UNION SELECT 3, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) FROM Table3
    INNER JOIN CONTAINSTABLE(Table3, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table3.Id
UNION SELECT 4, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) FROM Table4
    INNER JOIN CONTAINSTABLE(Table4, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table4.Id
UNION SELECT 5, FTS.Rank, Id, Title, LEFT([Description], @MaxDescription) FROM Table5
    INNER JOIN CONTAINSTABLE(Table5, *, @FtsQuery) AS FTS ON FTS.[KEY] = Table5.Id
ORDER BY RankGroup, Rank DESC

One idea I'd considered is to create an indexed view and then perform the search on the view. But since the view would need these UNIONs, it's hard to see how that would be any more efficient.

like image 494
Jonathan Wood Avatar asked Jan 19 '12 20:01

Jonathan Wood


1 Answers

This is a difficult issue, because CONTAINSTABLE can only search a single table's FTS index at a time. Your UNION solution above is fine as long as your performance is acceptable.

We faced the same issue of needing to efficiently search many columns from many tables in a single query. What we did was aggregate all of the data from these columns and tables into a single read-only table. Our query then only needed a single CONTAINSTABLE call

 CONTAINSTABLE(AggregatedTable, AggregatedColumn, @FtsQuery)

We have a scheduled job that runs every 5-10 minutes and incrementally aggregates any modified content from our source table into our single read-only aggregated content table.

In general it seems that using FTS in any reasonably-sized database and user load means you are always battling with performance. If you find that no matter what you do you cannot get the performance to be acceptable, you may need to investigate other technologies such as Lucene.

like image 157
Joe Alfano Avatar answered Oct 20 '22 13:10

Joe Alfano