
Solr Collection vs Cores

Tags: solr, lucene

I am struggling to understand the difference between collections and cores. If I understand correctly, each core is a separate index, and a collection consists of cores. So the two separate data in essentially the same way, i.e. separate cores and separate collections each have their own endpoints.

I have the following scenario. I am building a backend for a cloud service that hosts several online shops. Each shop has a set of products, to which customers can add reviews. I want to index the static data (product information) separately from the dynamic data (reviews) to improve performance.

How can I best separate these in Solr?

NeatNerd asked Jun 11 '13 12:06


People also ask

What are cores in Solr?

In Solr, the term core is used to refer to a single index and associated transaction log and configuration files (including the solrconfig.xml and schema files, among others).

What is shard and replica in Solr?

In Solr terminology, there is a sharp distinction between the logical parts of an index (collections, shards) and the physical manifestations of those parts (cores, replicas).

What are shards in Solr?

In SolrCloud, a shard is a logical partition of a collection. This partition stores part of the entire index for a collection. The number of shards you have helps to determine how many documents a single collection can contain in total, and also impacts search performance.

How does Solr Sharding work?

Solr sharding involves splitting a single Solr index into multiple parts, which may be on different machines. When the data is too large for one node, you can break it up and store it in sections by creating one or more shards, each containing a unique slice of the index.
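
To make "each containing a unique slice of the index" concrete, here is a minimal Python sketch of hash-based partitioning. It is only an illustration under stated assumptions: it uses CRC32 modulo the shard count, which is not Solr's actual document router, and the names `shard_for`, `partition`, and the `product-N` ids are made up for the example.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Map a document id to a shard index (illustrative, not Solr's router)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def partition(doc_ids, num_shards):
    """Assign every document id to exactly one shard."""
    shards = {i: [] for i in range(num_shards)}
    for doc_id in doc_ids:
        shards[shard_for(doc_id, num_shards)].append(doc_id)
    return shards


# Hypothetical document ids, split across three shards.
doc_ids = [f"product-{n}" for n in range(1000)]
shards = partition(doc_ids, 3)
for index, ids in shards.items():
    print(f"shard{index + 1}: {len(ids)} documents")
```

The point is that the same id always lands on the same shard, and no document is stored in two shards, so the shards together hold the whole index.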


2 Answers

From the SolrCloud documentation:

Collection: A single search index.

Shard: A logical section of a single collection (also called a Slice). Sometimes people will talk about a "Shard" in a physical sense (a manifestation of a logical shard).

Replica: A physical manifestation of a logical Shard, implemented as a single Lucene index on a SolrCore.

Leader: One Replica of every Shard will be designated as a Leader to coordinate indexing for that Shard.

SolrCore: Encapsulates a single physical index. One or more make up logical shards (or slices) which make up a collection.

Node: A single instance of Solr. A single Solr instance can have multiple SolrCores that can be part of any number of collections.

Cluster: All of the nodes you are using to host SolrCores.

So basically, a collection (a logical group) is made up of multiple cores (physical indexes).
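
The hierarchy in those definitions can be sketched as a small data model. This is an illustration only, not Solr code: the class names mirror the glossary terms, while `build_collection`, the node addresses, the round-robin placement, and "first replica is the leader" are assumptions made for the example (Solr chooses placement and elects leaders itself).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    """Physical copy of a shard: one Lucene index hosted by a SolrCore."""
    core_name: str
    node: str
    is_leader: bool = False


@dataclass
class Shard:
    """Logical slice of a collection, backed by one or more replicas."""
    name: str
    replicas: List[Replica] = field(default_factory=list)

    def leader(self) -> Replica:
        return next(r for r in self.replicas if r.is_leader)


@dataclass
class Collection:
    """Logical index made of shards."""
    name: str
    shards: List[Shard] = field(default_factory=list)

    def cores(self) -> List[str]:
        return [r.core_name for s in self.shards for r in s.replicas]


def build_collection(name, nodes, num_shards, replication_factor):
    """Hypothetical helper: place replicas on nodes round-robin."""
    collection = Collection(name)
    slot = 0
    for s in range(1, num_shards + 1):
        shard = Shard(f"shard{s}")
        for r in range(1, replication_factor + 1):
            shard.replicas.append(Replica(
                core_name=f"{name}_shard{s}_replica{r}",
                node=nodes[slot % len(nodes)],
                is_leader=(r == 1),  # assumption: first replica leads
            ))
            slot += 1
        collection.shards.append(shard)
    return collection


# Hypothetical cluster of two nodes hosting one collection.
products = build_collection("products", ["node1:8983", "node2:8983"], 2, 2)
print(products.cores())
```

With 2 shards and 2 replicas each, the single logical collection ends up as 4 physical cores spread over the nodes.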

Also, check the discussion

Jayendra answered Sep 28 '22 00:09


Core

In Solr, a core is composed of a set of configuration files, Lucene index files, and Solr’s transaction log.

A Solr core is a uniquely named, managed, and configured index running in a Solr server; a Solr server can host one or more cores. A core is typically used to separate documents that have different schemas.

Collection
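
Applied to the scenario in the question, that would mean one core for products and one for reviews, each queried through its own endpoint. Below is a rough Python sketch that only builds the query URLs. The base address, the core names `products` and `reviews`, and the fields `name` and `product_id` are assumptions for illustration, and `select_url` is a made-up helper.

```python
from urllib.parse import urlencode

# Assumed address of a local Solr server.
SOLR_BASE = "http://localhost:8983/solr"


def select_url(core, query, rows=10):
    """Build the search URL for one core; each core has its own path."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{SOLR_BASE}/{core}/select?{params}"


# Hypothetical cores with different schemas.
products_url = select_url("products", "name:laptop")
reviews_url = select_url("reviews", "product_id:42", rows=5)

print(products_url)
print(reviews_url)
```

Because the two cores are independent indexes, frequent updates to reviews do not touch the product index.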

Solr also uses the term collection, which only has meaning in the context of a Solr cluster in which a single index is distributed across multiple servers.

SolrCloud introduces the concept of a collection, which extends the concept of a uniquely named, managed, and configured index to one that is split into shards and distributed across multiple servers.
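
As a rough illustration of how a collection turns into cores, the Python sketch below builds a Collections API `CREATE` request URL and computes how many cores would back the collection. The base address, the collection name `reviews`, and the helper `create_collection_url` are assumptions for the example; check the parameters against the documentation for your Solr version before relying on them.

```python
from urllib.parse import urlencode


def create_collection_url(base, name, num_shards, replication_factor):
    """Build a Collections API CREATE request URL (request is not sent)."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    })
    return f"{base}/admin/collections?{params}"


# Assumed local SolrCloud node and a hypothetical collection.
num_shards, replication_factor = 2, 2
url = create_collection_url(
    "http://localhost:8983/solr", "reviews", num_shards, replication_factor
)
total_cores = num_shards * replication_factor

print(url)
print(f"cores backing the collection: {total_cores}")
```

One logical collection with 2 shards and a replication factor of 2 is served by 4 cores in total, which the cluster distributes across its nodes.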

Nanhe Kumar answered Sep 28 '22 00:09