Full text search matching only certain words?

Tags: c, sql, sqlite

I started using SQLite recently, so I am relatively new to it. I am trying to use its full-text search feature to find rough matches for a chat robot. Basically, I want to match as many keywords as possible, but not necessarily all of them. Results should be ranked by how many of the keywords were found in the phrase and by how closely their order follows the query. In other words, the order doesn't have to be exact, but the closer it is, the higher the result should rank. Similarly, a phrase should match even if it contains only one or two of the words, but rank higher the more of them are present. I have read the reference and seen the NEAR operator and the matchinfo function, along with the example of how to use it, but I cannot figure out how to apply them to my specific problem. Does anyone have any suggestions?

Thanks in advance for your help.

Philip Bennefall asked Jun 14 '12 16:06



1 Answer

I was recently told on the SQLite mailing list that this is not possible. The closest I came to a solution was to strip out stop words, as a search engine would, and to use the Porter stemming algorithm to generalize queries further. I run up to four searches, from most to least specific: first the full set of keywords (with punctuation and the like removed), then the same set with stemming applied, then the set with stop words stripped, and finally that stripped subset with stemming applied. This seems to give a reasonable approximation from best to worst. As soon as one search finds matches, the more general queries that follow in the chain are not executed.
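The chain of queries described above might look roughly like the following. This is only a sketch, in Python for brevity, and it rests on several assumptions that are not part of the original answer: the phrases are indexed twice, once in an FTS4 table with the default tokenizer and once in an FTS4 table created with `tokenize=porter` (one way to get "stemming applied"); the table names `phrases_exact` and `phrases_stem`, the helper names, and the tiny stop-word list are all hypothetical. It also requires an SQLite build with FTS4 enabled.

```python
import re
import sqlite3

# Illustrative stop-word list (an assumption, not taken from the answer).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "do", "you", "what"}


def build_index(conn, phrases):
    """Index every phrase twice: once unstemmed, once with the Porter tokenizer."""
    conn.execute("CREATE VIRTUAL TABLE phrases_exact USING fts4(body)")
    conn.execute("CREATE VIRTUAL TABLE phrases_stem USING fts4(body, tokenize=porter)")
    rows = [(p,) for p in phrases]
    conn.executemany("INSERT INTO phrases_exact(body) VALUES (?)", rows)
    conn.executemany("INSERT INTO phrases_stem(body) VALUES (?)", rows)


def keywords(text):
    """Lowercase the text and keep only alphanumeric words (drops punctuation)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cascade_search(conn, text):
    """Try the four searches from most to least specific; stop at the first hit."""
    words = keywords(text)
    stripped = [w for w in words if w not in STOP_WORDS]
    attempts = [
        ("phrases_exact", words),     # 1. all keywords, unstemmed
        ("phrases_stem", words),      # 2. all keywords, stemmed
        ("phrases_exact", stripped),  # 3. stop words removed, unstemmed
        ("phrases_stem", stripped),   # 4. stop words removed, stemmed
    ]
    for table, terms in attempts:
        if not terms:
            continue  # nothing left to search for at this step
        # Table names come from the fixed list above, never from user input.
        # Space-separated terms are an implicit AND in FTS3/FTS4 queries.
        sql = f"SELECT body FROM {table} WHERE {table} MATCH ?"
        rows = conn.execute(sql, (" ".join(terms),)).fetchall()
        if rows:
            return [row[0] for row in rows]
    return []
```

With this arrangement the Porter tokenizer stems both the indexed text and the query terms, so the caller never has to stem anything itself. Every step still requires all of its keywords to be present, so this approximates "best to worst" only by loosening the query, not by scoring partial matches.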

Philip Bennefall answered Nov 08 '22 02:11