Full Text Search primer? [closed]

Can anybody recommend a good book, paper, or article on full-text search (and maybe indexing in general)? I'm pretty anal about understanding what's happening behind the scenes in my applications, and I'm having trouble understanding why Sphinx and other external FTS engines leave MySQL/MyISAM in the dust.

Travis Warlick asked Jun 26 '09 17:06


3 Answers

For understanding full text search from the bottom up, I recommend "Managing Gigabytes".

http://www.cs.mu.oz.au/mg/

George Phillips answered Oct 19 '22 04:10


I found the PostgreSQL Full Text Search documentation (http://www.postgresql.org/docs/8.3/static/textsearch.html) very enlightening.

Especially the introduction, http://www.postgresql.org/docs/8.3/static/textsearch-intro.html, which says:

Textual search operators have existed in databases for years. PostgreSQL has ~, ~*, LIKE, and ILIKE operators for textual data types, but they lack many essential properties required by modern information systems:

  • There is no linguistic support, even for English. Regular expressions are not sufficient because they cannot easily handle derived words, e.g., satisfies and satisfy. You might miss documents that contain satisfies, although you probably would like to find them when searching for satisfy. It is possible to use OR to search for multiple derived forms, but this is tedious and error-prone (some words can have several thousand derivatives).
  • They provide no ordering (ranking) of search results, which makes them ineffective when thousands of matching documents are found.
  • They tend to be slow because there is no index support, so they must process all documents for every search.
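
As a rough illustration of the three points above, here is a minimal Python sketch of an inverted index with crude word normalization and count-based ranking. It is not how PostgreSQL, Sphinx, or MySQL actually implement full text search; the function names (normalize, build_index, search) and the suffix-stripping rule are made up for this example, and a real engine would use a proper stemmer and a better ranking function.

```python
import re
from collections import defaultdict


def normalize(word):
    """Crude stand-in for a stemmer (an assumption for this sketch):
    lowercase the word and strip one common English suffix."""
    word = word.lower()
    for suffix in ("ies", "es", "s", "y"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(docs):
    """Build an inverted index: normalized term -> {doc_id: occurrence count}.
    Done once up front, so a search does not have to rescan every document."""
    index = defaultdict(lambda: defaultdict(int))
    for doc_id, text in docs.items():
        for token in re.findall(r"[a-z]+", text.lower()):
            index[normalize(token)][doc_id] += 1
    return index


def search(index, query):
    """Return doc ids matching any query term, best match first.
    The score is just the summed term count, a deliberately naive ranking."""
    scores = defaultdict(int)
    for token in re.findall(r"[a-z]+", query.lower()):
        for doc_id, count in index.get(normalize(token), {}).items():
            scores[doc_id] += count
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


docs = {
    1: "This plan satisfies every requirement",
    2: "Nothing here is relevant",
    3: "To satisfy users, satisfy their needs",
}
index = build_index(docs)
print(search(index, "satisfy"))  # [3, 1]
```

Searching for "satisfy" finds document 1 even though it only contains "satisfies", ranks document 3 first because the term occurs twice there, and touches only the index entry for that term instead of scanning all three documents.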
Christopher answered Oct 19 '22 03:10


There is an excellent book on information retrieval, including text search (Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008), available free (and legitimately) here.

unmounted answered Oct 19 '22 03:10