Best text search engine for integrating with custom web app?

We have a web app that lets users upload documents, create their own documents, and so on. Uploaded files are stored on Amazon S3; content created in the app is stored in a MySQL database. What I'm looking for is a search engine that I can feed all of our text documents, each with a unique ID, so that it builds an index. Later, I can give it search queries, and it will return the best-matching documents (by ID), along with snippets of the matching text.
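To make that contract concrete, here is a toy sketch in Python: add documents by ID, then query and get back ranked IDs with snippets. It is only an illustration of the interface being asked for, not how Lucene, Sphinx, or MySQL work internally. The `TinyIndex` class, its method names, and the scoring by raw term count are all made up for this example.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical): term -> {doc_id: occurrence count}."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.texts = {}

    @staticmethod
    def tokenize(text):
        # Lowercase and split into runs of ASCII letters and digits.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, doc_id, text):
        # Keep the raw text for snippets and count each term per document.
        self.texts[doc_id] = text
        for term in self.tokenize(text):
            counts = self.postings[term]
            counts[doc_id] = counts.get(doc_id, 0) + 1

    def search(self, query, width=30):
        # Score each document by how often the query terms occur in it.
        scores = defaultdict(int)
        terms = self.tokenize(query)
        for term in terms:
            for doc_id, count in self.postings.get(term, {}).items():
                scores[doc_id] += count
        ranked = sorted(scores.items(), key=lambda item: (-item[1], str(item[0])))
        return [(doc_id, self.snippet(doc_id, terms, width)) for doc_id, _ in ranked]

    def snippet(self, doc_id, terms, width):
        # Return the text around the first query term found in the document.
        text = self.texts[doc_id]
        lowered = text.lower()
        for term in terms:
            pos = lowered.find(term)
            if pos != -1:
                start = max(0, pos - width)
                return text[start:pos + len(term) + width]
        return text[:2 * width]


index = TinyIndex()
index.add(1, "Quarterly report on search engine performance")
index.add(2, "Notes about lunch")
index.add(3, "Search tips: search by title or by author")
results = index.search("search")  # list of (doc_id, snippet) pairs, best match first
```

A real engine adds stemming, relevance ranking, incremental updates, and on-disk storage on top of this basic shape, which is why an existing engine is the better choice.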

Basically, we want to let our users search through their own repository of uploaded content, along with anything that other users have marked as public. The solution should run on a standard Linux server, and ideally it would be open source, but I'll also consider paid solutions if they aren't outrageously priced.

So far, I've found three potential candidates:

  1. MySQL Full Text Search - some reports I've read say it's very slow
  2. Apache Lucene - unfortunately written in Java, but I'll use it if I have to. Supposedly fast
  3. Sphinx - doesn't seem to be as popular; ideally, whatever solution I find will have lots of community support

Please let me know if there are any other good choices that I've overlooked, or if you have experience with any of the above.

davr asked Mar 01 '23 07:03


2 Answers

Take a look at Solr. It's based on Lucene, so it's very fast, and because it runs as a standalone server with an HTTP API, it's really easy to use from any platform.
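As a rough illustration of what "easy to use from any platform" can look like, here is a sketch that talks to Solr over plain HTTP using only the Python standard library. The host, the core name `documents`, and the field name `text` are assumptions for this example, and the JSON update format assumes a reasonably recent Solr version. The query parameters (`q`, `fl`, `hl`, `hl.fl`, `rows`, `wt`) are standard Solr request parameters.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed location of a Solr core named "documents"; adjust to your setup.
SOLR_BASE = "http://localhost:8983/solr/documents"


def build_update_request(docs):
    """Build a POST request that indexes a list of {"id": ..., "text": ...} dicts."""
    body = json.dumps(docs).encode("utf-8")
    return Request(
        SOLR_BASE + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_search_url(query, rows=10):
    """Build a select URL that returns matching IDs plus highlighted snippets."""
    params = {
        "q": query,       # e.g. "text:invoice" (the field name "text" is assumed)
        "fl": "id",       # return only the document ID
        "hl": "true",     # turn on highlighting (snippets)
        "hl.fl": "text",  # field to take snippets from (assumed name)
        "rows": rows,
        "wt": "json",
    }
    return SOLR_BASE + "/select?" + urlencode(params)


update_request = build_update_request([{"id": "42", "text": "Invoice for March"}])
search_url = build_search_url("text:invoice")
```

Passing `update_request` or `search_url` to `urllib.request.urlopen` would send them to a running Solr instance. In the JSON response, the snippets should appear under a `highlighting` key, keyed by document ID.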

Mauricio Scheffer answered Mar 05 '23 16:03


Sphinx may be worth your consideration, as it works well with several common RDBMSs (notably MySQL).

Marc Gear answered Mar 05 '23 15:03
