
Do multiple Solr shards on a single machine improve performance?

Does running multiple Solr shards on a single machine improve performance? I would expect Lucene to be multi-threaded, but it doesn't seem to use more than a single core on my server, which has 16 physical cores. I realize this is workload-dependent, but any statistics or benchmarks would be very useful!

cberner asked Mar 23 '12 20:03





1 Answer

I ran some benchmarks of our search stack and found that adding more Solr shards (on a single machine with 16 physical cores) did improve performance, up to about 8 shards, where I got a 6.5x speedup. This was on an index with ~1.5 million documents, running complex range queries.
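
To put that number in perspective, the arithmetic can be sketched in a few lines of Python. The function name is only illustrative; the inputs are the two figures quoted above (6.5x speedup, 8 shards).

```python
def parallel_efficiency(speedup: float, workers: int) -> float:
    """Return the fraction of ideal linear scaling that was achieved."""
    if workers <= 0:
        raise ValueError("workers must be positive")
    return speedup / workers


# Figures from the benchmark above: 6.5x speedup with 8 shards.
print(f"{parallel_efficiency(6.5, 8):.2%}")  # prints 81.25%
```

In other words, 8 shards delivered roughly four fifths of a perfect 8x speedup on this workload; other indexes and query mixes may scale differently.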

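So it seems that Solr doesn't take advantage of multiple physical cores when running a query against a single index. Splitting the index into several shards on the same machine lets each shard search its own slice in parallel.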
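
For reference, here is a minimal sketch of how a query can be fanned out over several local shards using Solr's `shards` request parameter. The host, ports, core names, and the `price` field are assumptions made up for illustration, not my actual setup; adjust them to match your own instances.

```python
from urllib.parse import urlencode


def distributed_query_url(coordinator, shard_addrs, query, rows=10):
    """Build a Solr select URL that fans the query out to every shard.

    coordinator: "host:port/path" of the core that receives the request.
    shard_addrs: list of "host:port/path" strings, one per shard.
    """
    params = {
        "q": query,
        "rows": rows,
        "wt": "json",
        # Solr expects a comma-separated list of shard addresses.
        "shards": ",".join(shard_addrs),
    }
    return f"http://{coordinator}/select?{urlencode(params)}"


# Hypothetical layout: 8 cores named shard0..shard7 in one local Solr instance.
shards = [f"localhost:8983/solr/shard{i}" for i in range(8)]
url = distributed_query_url(shards[0], shards, "price:[10 TO 100]")
print(url)
```

The core that receives the request forwards the query to every address in `shards` and merges the results, so each shard's search can run on a different CPU core.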

cberner answered Sep 23 '22 03:09
