
Solr AutoScaling - Add replicas on new nodes

I am using Solr version 7.3.1, starting with 3 nodes.

I have created a collection like this:

wget "localhost:8983/solr/admin/collections?action=CREATE&autoAddReplicas=true&collection.configName=my_col_config&maxShardsPerNode=1&name=my_col&numShards=1&replicationFactor=3&router.name=compositeId&wt=json" -O /dev/null

In this way I have a replica on each node.

GOAL:

  • Each shard should add a replica to new nodes joining the cluster.
  • When a node is shut down, it should just go away.
  • Only one replica for each shard on each node.

I know that it should be possible with the new Autoscaling API, but I am having a hard time finding the right syntax. The API is very new and all I can find is the documentation. It's not bad, but I am missing some more examples.

This is how it looks today. There are many small shards, each with a replication factor that matches the number of nodes. Right now there are 3 nodes. [screenshot]

This video was uploaded yesterday (2018-06-13). Around 30 minutes in, there is an example of the solr.HttpTriggerListener, which can be used to call any kind of service, for example an AWS Lambda to add new nodes.
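A minimal sketch of what registering such a listener might look like, assuming the Solr 7.x autoscaling write API at /solr/admin/autoscaling and its set-listener command. The listener name, the trigger name, and the callback URL are hypothetical placeholders, and post_command is an illustrative helper, not part of Solr.

```python
import json
import urllib.request

# Assumption: a local Solr 7.x node exposing the autoscaling write API here.
AUTOSCALING_URL = "http://localhost:8983/solr/admin/autoscaling"


def http_listener_command(trigger_name, callback_url):
    """Build a set-listener command that calls callback_url on trigger events."""
    return {
        "set-listener": {
            # Hypothetical listener name derived from the trigger name.
            "name": trigger_name + "_http_listener",
            "trigger": trigger_name,
            # Only notify once the trigger's actions have finished or failed.
            "stage": ["SUCCEEDED", "FAILED"],
            "class": "solr.HttpTriggerListener",
            "url": callback_url,
        }
    }


def post_command(command, url=AUTOSCALING_URL):
    """POST one autoscaling command as JSON and return the decoded reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical trigger name and endpoint (for example an API gateway in
# front of an AWS Lambda); nothing is sent until post_command is called.
listener = http_listener_command("node_lost_trigger", "https://example.invalid/scale")
print(json.dumps(listener, indent=2))
```

Calling post_command(listener) against a running cluster would submit the command. The trigger named in the listener has to exist already.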


Martin Andersen asked Jun 13 '18 13:06



1 Answer

The short answer is that your goals are not achievable today (as of Solr 7.4).

The NodeAddedTrigger only moves replicas from other nodes to the new node in an attempt to balance the cluster. It does not support adding new replicas. I have opened SOLR-12715 to add this feature.

Similarly, the NodeLostTrigger adds new replicas on other nodes to replace the ones on the lost node. It, too, has no support for merely deleting replicas from the cluster state. I have opened SOLR-12716 to address that issue. I hope to release both enhancements in Solr 7.5.
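For reference, here is a minimal sketch of how the two triggers discussed above are typically registered, assuming the Solr 7.x set-trigger command posted to /solr/admin/autoscaling. The trigger names and waitFor values are illustrative choices. With the stock compute/execute actions, these triggers give the move and replace behaviour described here, not the add and delete behaviour the question asks for.

```python
import json


def trigger_command(name, event, wait_for="30s"):
    """Build a set-trigger command using Solr's stock plan actions."""
    return {
        "set-trigger": {
            "name": name,
            "event": event,  # "nodeAdded" or "nodeLost"
            "waitFor": wait_for,  # delay before the trigger fires
            "enabled": True,
            "actions": [
                {"name": "compute_plan", "class": "solr.ComputePlanAction"},
                {"name": "execute_plan", "class": "solr.ExecutePlanAction"},
            ],
        }
    }


# Illustrative names and delays; tune them for the cluster in question.
node_added = trigger_command("node_added_trigger", "nodeAdded", wait_for="5s")
node_lost = trigger_command("node_lost_trigger", "nodeLost", wait_for="120s")

print(json.dumps(node_added, indent=2))
print(json.dumps(node_lost, indent=2))
```

Each printed object is the JSON body for one POST request to the autoscaling endpoint.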

As for the third goal:

Only one replica for each shard on each node.

To achieve this, the policy rule given in the "Limit Replica Placement" example should suffice. However, looking at the screenshot you've posted, you actually mean one replica per (collection, shard) pair, which is unsupported today. You'd need a policy rule like the following (it does not work as written because collection:#EACH is not supported):

{"replica": "<2", "collection": "#EACH", "shard": "#EACH", "node": "#ANY"}

I have opened SOLR-12717 to add this feature.
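For comparison, a minimal sketch of the rule that is supported today, assuming it matches the "Limit Replica Placement" example and is installed through the set-cluster-policy command of the autoscaling API. It is the rule above without the collection attribute. The constant and helper names are illustrative.

```python
import json

# Supported form: fewer than two replicas of each shard on any one node.
ONE_REPLICA_PER_SHARD_PER_NODE = {"replica": "<2", "shard": "#EACH", "node": "#ANY"}


def cluster_policy_command(rules):
    """Wrap policy rules in a set-cluster-policy command."""
    return {"set-cluster-policy": list(rules)}


command = cluster_policy_command([ONE_REPLICA_PER_SHARD_PER_NODE])
print(json.dumps(command))
```

The printed JSON would be the body of a POST request to /solr/admin/autoscaling.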

Thank you for these excellent use cases. I'd recommend asking questions such as these on the solr-user mailing list because not many Solr developers frequent Stack Overflow. I only found this question because it was posted on the docker-solr project.

Shalin Shekhar Mangar answered Oct 05 '22 10:10
