
php mysql fulltext search: lucene, sphinx, or?

This is admittedly similar to (but not a duplicate of) Comparison of full text search engine - Lucene, Sphinx, Postgresql, MySQL?. However, what I am looking for are specific, supported recommendations based on experience with more than one of the available systems (there seems to be a lot of "I've used Lucene, but not Sphinx", and vice versa).

The setup: standard LAMP (MySQL 5.0, PHP 5).

MySQL: tables use the InnoDB engine for foreign key constraints.

We are looking at indexing data, not pages. The data to be indexed may be in multiple languages (UTF-8 charset).

A number of the comparisons I've come across (like http://blog.evanweaver.com/articles/2008/03/17/rails-search-benchmarks/) are either not entirely applicable (Ferret is a Lucene port, but not the same as Zend_Search_Lucene) or are pushing their own systems/implementations (not exactly unbiased).

Some others I've come across (such as http://whatstheplot.com/blog/tag/lucene/ and http://pagetracer.com/2008/02/15/sphinx-and-lucene-search-engines-first-impressions/) provide very different results for performance of the two systems.

Also, all but ignored in much of what I've read is Xapian. Might this be worth consideration as well?

So... I'm hoping that some of you here on SO have some experience with this question and could help with some recommendations or point me in the right direction.

Jonathan Fingland asked Jun 03 '09 02:06



1 Answer

One advantage of Sphinx is that you can "interpose" it between your clients and the MySQL server: it "interferes" only with queries that specifically address it and transparently passes the others through to MySQL -- see e.g. this article. Whether that's an advantage in your use case, you're best placed to say!

Sorry, no real-life experience with Xapian or Lucene -- still, reading about how to deploy them makes it sound (to me!) as if they might be worth it only if you had identified substantial advantages. Otherwise, Sphinx's "easy as pie" deployment, as a "proxy" between your clients and your MySQL server, feels like a big win to me!
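
To make the "proxy" idea a bit more concrete, here is a minimal Python sketch of the routing decision only: queries that address a search index go to the search daemon, everything else falls through to MySQL. This is not how Sphinx itself implements it; the index name `articles_idx`, the function `route_query`, and the regex-based table detection are all assumptions made up for illustration.

```python
import re

# Hypothetical set of full-text index names served by the search daemon.
SEARCH_INDEXES = {"articles_idx"}

# Naive pattern: grab the first identifier following FROM (optionally backquoted).
_FROM_RE = re.compile(r"\bFROM\s+`?(\w+)`?", re.IGNORECASE)


def route_query(sql, search_indexes=SEARCH_INDEXES):
    """Return "sphinx" if the query targets a search index, else "mysql".

    Illustrative only: a real setup would rely on the search engine's own
    protocol handling rather than on regex inspection of SQL text.
    """
    match = _FROM_RE.search(sql)
    if match and match.group(1) in search_indexes:
        return "sphinx"
    # Anything not explicitly addressed to a search index goes to MySQL.
    return "mysql"
```

In such a setup, only the full-text lookups would ever touch the search daemon, while ordinary reads and writes keep going to MySQL unchanged, which is the property the answer describes as the deployment win.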

Alex Martelli answered Sep 18 '22 20:09
