When to consider Solr

I am working on an application that needs to do interesting things with search, including full-text search, hit highlighting, faceted search, etc.

The dataset is likely to be between 3,000 and 10,000 records with 20-30 fields each, and is all stored in MySQL. The traffic profile of the site is likely to be on the small side of medium.

All of these requirements could be achieved (clunkily) in MySQL, but at what point (in terms of data-size and traffic levels) does it become worth looking at more focused technologies like Solr or Sphinx?

728

asked Feb 10 '11 18:02

Andy Hume

2 Answers

A complete answer to this question would have to be very broad. Specific details may well make one system superior to another for a particular use case, but I want to cover the basics here.

I will deal entirely with Solr, as an example of several search engines that work in roughly the same way.

I want to start with some hard facts:

You cannot rely on Solr/Lucene as a secure database. There are several reasons, but they mostly come down to missing recovery options, the lack of ACID transactions, possible complications, etc. If you decide to use Solr, you need to populate your index from another source, such as an SQL table. In fact, Solr is perfect for storing documents that combine data from several tables and relations, which would otherwise require complex joins to construct.
Solr/Lucene provides mind-blowing text analysis, stemming, full-text search scoring and fuzzy matching: things you just cannot do with MySQL. In older MySQL versions, full-text search is limited to MyISAM tables (InnoDB gained FULLTEXT indexes in MySQL 5.6), and the scoring is very basic and limited. Weighting fields, boosting documents on certain metrics, scoring results by phrase proximity, matching accuracy, etc. ranges from very hard work to almost impossible.
In Solr/Lucene you have documents. You cannot really store and process relations. You can, of course, index the keys of other documents in a multivalued field of a document, which lets you store 1:n relations, and do it in both directions to get n:n, but that adds data overhead. Don't get me wrong, it's perfectly fine and efficient for a lot of purposes (for example, a product catalog where you store the distributors for each product and want to search only for parts that are available at certain distributors). But you reach the limit of what is possible with HAS / HAS NOT. You can hardly do something like "get all products that are available at at least 3 distributors".
Solr/Lucene has very nice faceting features and post-search analysis. For example, after a very broad search with 40,000 hits, you can show that only 3 hits would remain if the search were refined to a particular combination of field values. Things that need additional queries in MySQL are done efficiently and conveniently.
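To make the points above about field weighting, faceting and HAS filters more concrete, here is a minimal sketch of the request parameters such a search might use. The parameter names (defType, qf, pf, hl, facet, facet.field, fq) are standard Solr query parameters, but the field names (title, description, category, distributor_ids), the core name "products" and the host are assumptions made up for this illustration. The code only builds the query string and does not contact a server.

```python
from urllib.parse import urlencode, parse_qs

def build_select_params(user_query, distributor_id=None):
    """Build the query string for a Solr /select request.

    All field names are hypothetical; adapt them to your own schema.
    """
    params = [
        ("q", user_query),
        ("defType", "edismax"),          # extended DisMax query parser
        ("qf", "title^3 description"),   # title matches weigh 3x as much
        ("pf", "title^5"),               # boost when the whole phrase is in the title
        ("hl", "true"),                  # hit highlighting
        ("hl.fl", "description"),        # field to highlight
        ("facet", "true"),               # facet counts alongside the results
        ("facet.field", "category"),
        ("facet.field", "distributor_ids"),
        ("facet.mincount", "1"),         # hide facet values with zero hits
        ("rows", "10"),
        ("wt", "json"),
    ]
    if distributor_id is not None:
        # HAS filter on the multivalued field holding distributor keys
        params.append(("fq", f"distributor_ids:{distributor_id}"))
    return urlencode(params)

query_string = build_select_params("usb cable", distributor_id=42)
# Host and core name are assumptions for the example.
print("http://localhost:8983/solr/products/select?" + query_string)
print(parse_qs(query_string)["facet.field"])
```

The facet counts come back in the same response as the hits, which is the "additional queries in MySQL" saving described above.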

So let's sum up

The power of Lucene is text searching and analysis. It is also mind-blowingly fast because of its inverted index structure. You can do a lot of post-processing and satisfy other needs. Although it is document oriented and has no "graph querying" like triple stores do with SPARQL, basic N:M relations can be stored and queried. If your application is focused on text searching, you should definitely go for Solr/Lucene unless you have good reasons to do otherwise, such as very complex, multi-dimensional range filter queries.
If you do not have text search, but rather an interface where users point and click instead of entering text, good old relational databases are probably the better way to go.
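The index structure credited above for Lucene's speed is usually called an inverted index: a map from each term to the documents that contain it. The following is only a toy sketch of the idea, not how Lucene actually implements it. The tokenizer is deliberately crude and the sample documents are made up.

```python
from collections import defaultdict

def tokenize(text):
    """Very crude analyzer: lowercase and split on whitespace."""
    return text.lower().split()

def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index

def search_all_terms(index, query):
    """Return the ids of documents containing every query term (AND search)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)

docs = {
    1: "Red USB cable",
    2: "Blue USB charger",
    3: "Red power cable",
}
index = build_inverted_index(docs)
print(sorted(search_all_terms(index, "red cable")))  # prints [1, 3]
```

A lookup touches only the postings of the query terms instead of scanning every record, which is why this kind of search stays fast as the dataset grows.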

159

answered Oct 03 '22 01:10

The Surrican

Use Solr if:

You do not want to stress your database.
You want real full-text search.
You need lightning-fast search results.

I currently maintain a news website with 5 million users per month, with MySQL as the main datastore and Solr as the search engine.
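In a setup like this, MySQL stays the source of truth and the search index is built from it. As a rough sketch of that indexing step, the function below flattens rows from two tables into one document per record, with related keys collected into a multivalued field. The table layout, field names and sample rows are hypothetical; in practice the rows would come from SQL queries and the resulting documents would be sent to Solr.

```python
from collections import defaultdict

def build_documents(products, product_distributors):
    """Flatten joined rows into one search document per product."""
    # Group the distributor keys by product (the 1:n relation).
    distributors_by_product = defaultdict(list)
    for product_id, distributor_id in product_distributors:
        distributors_by_product[product_id].append(distributor_id)

    documents = []
    for product in products:
        doc = dict(product)  # copy so the input rows are not modified
        # Multivalued field holding the keys of the related records
        doc["distributor_ids"] = sorted(distributors_by_product[product["id"]])
        documents.append(doc)
    return documents

# Hypothetical rows, standing in for the results of two SQL queries.
products = [
    {"id": 1, "title": "Red USB cable"},
    {"id": 2, "title": "Blue USB charger"},
]
product_distributors = [(1, 42), (1, 7), (2, 42)]

docs = build_documents(products, product_distributors)
print(docs)
```

Because the index can always be rebuilt this way, losing it is an inconvenience rather than a data loss.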

answered Oct 03 '22 01:10

Fernando Garza

Tags: performance, mysql, solr