
Search engine Lucene vs Database search

I am using a MySQL database and have so far relied on database-driven search. What are the advantages and disadvantages of database search compared with the Lucene search engine? I would also like suggestions on when and where to use each.

Santosh Linkha asked Jan 09 '11 10:01




2 Answers

I suggest you read Full Text Search Engines vs. DBMS. A one-liner would be: If the bulk of your use case is full text search, use Lucene. If the bulk of your use case is joins and other relational operations, use a database. You may use a hybrid solution for a more complicated use case.
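To make the hybrid idea concrete, here is a minimal sketch under stated assumptions: a toy in-memory inverted index stands in for Lucene, and SQLite stands in for MySQL so the example runs on its own. The `authors`/`posts` schema, the sample rows, and the `search_with_authors` helper are all hypothetical, invented for illustration. The text index answers "which documents contain this term?", and the database then handles the join.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema and data, invented for illustration only.
# SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES
        (1, 1, 'Lucene ranks documents by relevance'),
        (2, 2, 'MySQL handles joins and transactions'),
        (3, 2, 'Tuning Lucene analyzers for search');
""")

# Toy inverted index (term -> set of post ids). This is a stand-in for a
# real full-text engine such as Lucene, not how Lucene is implemented.
index = defaultdict(set)
for post_id, body in conn.execute("SELECT id, body FROM posts").fetchall():
    for token in body.lower().split():
        index[token].add(post_id)

def search_with_authors(term):
    """Look the term up in the text index, then join in the database."""
    ids = sorted(index.get(term.lower(), ()))
    if not ids:
        return []
    placeholders = ",".join("?" * len(ids))
    sql = (
        "SELECT p.id, a.name FROM posts p "
        "JOIN authors a ON a.id = p.author_id "
        f"WHERE p.id IN ({placeholders}) ORDER BY p.id"
    )
    return conn.execute(sql, ids).fetchall()

results = search_with_authors("lucene")
```

Keeping only document ids in the text index and leaving everything relational in the database is one common way to split the work; where exactly to draw that line depends on the use case.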

Yuval F answered Sep 19 '22 11:09


Use Lucene when you want to index text documents (of any length) and search for text within them, getting back a list of matching documents ranked by relevance to the query. The classic example is a web search engine such as Google, which relies on text indexers similar to Lucene to index and query the content of web pages.

The advantages of using Lucene over a database like MySQL for indexing and searching text are:

  • for the developer - tools to analyse, parse and index textual information (e.g. tokenisation, stemming, plural handling, synonyms) in multiple languages. Lucene also scales very well for text search.
  • for the user - high-quality search results. Lucene's similarity function, which scores each document against the search query, is built on cosine similarity with term frequency / inverse document frequency (TF-IDF) weighting. This gives good results with very little upfront tweaking.
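The two points above can be illustrated with a small sketch of analysis followed by TF-IDF weighted cosine similarity. This is not Lucene's actual scoring code: the `tokenize`, `tfidf_vectors`, `cosine`, and `rank` helpers, the crude trailing-"s" stripping used in place of real stemming, the particular IDF formula, and the sample documents are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase, split on non-letters, and strip a trailing 's'
    (a crude stand-in for real stemming, for illustration only)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens]

def tfidf_vectors(docs):
    """Return one {term: weight} vector per document, plus the IDF table."""
    tokenized = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for toks in tokenized for term in set(toks))
    # One common smoothed IDF variant; rarer terms get higher weight.
    idf = {term: math.log(n / count) + 1.0 for term, count in df.items()}
    vectors = [
        {t: c * idf[t] for t, c in Counter(toks).items()} for toks in tokenized
    ]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def rank(query, docs):
    """Return indices of matching documents, best match first."""
    vectors, idf = tfidf_vectors(docs)
    q = {t: c * idf.get(t, 0.0) for t, c in Counter(tokenize(query)).items()}
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    ordered = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [i for score, i in ordered if score > 0]

# Sample documents, invented for illustration.
docs = [
    "Lucene indexes text documents",
    "MySQL stores relational tables",
    "Ranking documents with Lucene search",
]
ranking = rank("lucene document", docs)
```

Because the query "document" is reduced to the same token as "documents", both Lucene-related documents match, and the shorter one ranks first since cosine similarity normalises by document length. A plain SQL `LIKE '%term%'` filter gives neither the normalisation of word forms nor the ordering.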

Lots of useful info on Lucene here.

Joel answered Sep 20 '22 11:09