
How do I configure Solr replication with multiple cores

I have Solr running with multiple cores. Because of the heavy load, I want to set up a slave containing the exact same indexes.

The documentation http://wiki.apache.org/solr/SolrReplication states "Add the replication request handler to solrconfig.xml for each core", but I only have one solrconfig.xml.

My configuration:
Config: /data/solr/web/solr/conf/ (config files)
Data: /data/solr/data/solr/ (core data dirs)

Is it really necessary to copy the solrconfig.xml for each core?
And where should I put these multiple solrconfig files?

solr.xml

<?xml version="1.0" encoding="UTF-8" ?>
<solr persistent="true">
  <property name="dih.username" value="user"/>
  <property name="dih.password" value="passwd"/>
  <property name="jdbclib" value="/usr/progress/dlc102b/java"/>
  <property name="dih.dburl" value="jdbc:datadirect:openedge://172.20.7.218:31380;databaseName=easource"/>
  <cores adminPath="/admin/cores">
    <core instanceDir="/data/solr/web/trunk/" name="product" dataDir="/data/solr/data/trunk/product-swap">
      <property name="dih-config" value="dih-config-product.xml"/>
    </core>
    <core instanceDir="/data/solr/web/trunk/" name="product-swap" dataDir="/data/solr/data/trunk/product">
      <property name="dih-config" value="dih-config-product.xml"/>
    </core>
    <core instanceDir="/data/solr/web/trunk/" name="periodp" dataDir="/data/solr/data/trunk/periodp">
      <property name="dih.config" value="dih-config-periodp.xml"/>
    </core>
    <core instanceDir="/data/solr/web/trunk/" name="periodp-swap" dataDir="/data/solr/data/trunk/periodp-swap">
      <property name="dih.config" value="dih-config-periodp.xml"/>
    </core>
  </cores>
</solr>
DionS asked Oct 25 '12 09:10


1 Answer

Copy your existing Solr instance to the slave server and configure the replication handler in its solrconfig.xml. It is best practice to give each core its own instanceDir, since every core usually has its own schema.xml and solrconfig.xml. You can still share one conf directory, though: point every core in solr.xml at the same instanceDir but give each one a different dataDir. In the example below the data directory is passed as a core property named dataDir, which your solrconfig.xml must reference as its dataDir as well:

<solr persistent="true" sharedLib="lib">
    <cores adminPath="/admin/cores">
        <core name="core0" instanceDir="core">
            <property name="dataDir" value="/data/core0" />
        </core>
        <core name="core1" instanceDir="core">
            <property name="dataDir" value="/data/core1" />
        </core>
    </cores>
</solr>

This should be your situation if you currently have multiple cores but a single solrconfig.xml.
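
For the per-core dataDir property to take effect, the shared solrconfig.xml has to reference it. A minimal sketch, assuming the property is named dataDir exactly as in the solr.xml above:

```xml
<!-- solrconfig.xml, shared by all cores that use the same instanceDir -->
<!-- ${dataDir} is substituted with the per-core property defined in solr.xml, -->
<!-- so core0 would index into /data/core0 and core1 into /data/core1 -->
<dataDir>${dataDir}</dataDir>
```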

The replication section of solrconfig.xml on the slaves needs to contain the URL of the master, including the core name, which is of course different for each core. You can use the placeholder ${solr.core.name} for that, like this:

<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="slave">
        <str name="masterUrl">http://master_host:port/solr/${solr.core.name}/replication</str>
        <str name="pollInterval">00:00:20</str>
    </lst>
</requestHandler>

In fact, some properties such as solr.core.name are automatically added to the core scope, and you can refer to them in your configuration. As a result, the replication section can be identical for every core, as long as you don't need any core-specific settings.
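
To illustrate the substitution, here is roughly what the slave ends up with for one core. This is only a sketch: the core name product is taken from the question's solr.xml, and master_host and port remain placeholders.

```xml
<!-- Effective value on the slave for the core named "product": -->
<!-- ${solr.core.name} has been replaced by the name of the core being loaded -->
<str name="masterUrl">http://master_host:port/solr/product/replication</str>
```

The other cores (product-swap, periodp, periodp-swap) would resolve the same way, each polling the core of the same name on the master.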

Furthermore, you can use the same config for master and slave with the following configuration, and just change the value (true or false) of the properties enable.master and enable.slave depending on the role of each machine. These are Solr properties (set, for example, as Java system properties with -D or in solrcore.properties) rather than operating system environment variables. The file is the same, but it lives on different machines, since it wouldn't make much sense to run master and slaves on the same machine.

<requestHandler name="/replication" class="solr.ReplicationHandler" >
    <lst name="master">
        <str name="enable">${enable.master:false}</str>
        <str name="replicateAfter">commit</str>
    </lst>
    <lst name="slave">
        <str name="enable">${enable.slave:false}</str>
        <str name="masterUrl">http://master_host:8983/solr/${solr.core.name}/replication</str>
        <str name="pollInterval">00:00:60</str>
    </lst>
</requestHandler>
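
One possible way to set these flags per machine is to declare them as properties in that machine's solr.xml, the same way the question's solr.xml declares dih.username. This is a sketch under the assumption that properties declared directly under <solr> are visible to every core's solrconfig.xml in your Solr version; the core definition is copied from the example above.

```xml
<!-- solr.xml on the master machine -->
<solr persistent="true" sharedLib="lib">
    <!-- Turns on the "master" section of the shared replication handler -->
    <property name="enable.master" value="true"/>
    <cores adminPath="/admin/cores">
        <core name="core0" instanceDir="core">
            <property name="dataDir" value="/data/core0" />
        </core>
    </cores>
</solr>
```

On a slave machine the property would be enable.slave instead. Passing -Denable.master=true or -Denable.slave=true to the JVM that runs Solr is an alternative that leaves solr.xml identical everywhere.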
javanna answered Nov 06 '22 00:11
