Solr: changing the data directory in Windows

I'm trying to move my Solr core's data directory away from the default 'data' directory under the instance directory. I'm using an absolute path because my Solr core's instance and conf directories are buried elsewhere (inside my GitHub directory). I thought it would be as easy as specifying this in core.properties:

dataDir=C:\foo\bar\my_new_data_directory

Inside the 'my_new_data_directory' directory are the following Solr directories:

- index
- tlog

I'm using Windows and am getting the following error when starting up Solr:

ERROR - 2014-01-17 12:40:34.578; org.apache.solr.core.CoreContainer; Unable to create core: collection1
org.apache.solr.common.SolrException
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:680)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:625)
    at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:557)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:592)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:271)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:263)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: java.io.IOException: The filename, directory name, or volume label syntax is incorrect
    at java.io.WinNTFileSystem.canonicalize0(Native Method)
    at java.io.Win32FileSystem.canonicalize(Unknown Source)
    at java.io.File.getCanonicalPath(Unknown Source)
    at org.apache.solr.core.StandardDirectoryFactory.normalize(StandardDirectoryFactory.java:47)
    at org.apache.solr.core.DirectoryFactory.getDataHome(DirectoryFactory.java:246)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:677)
    ... 13 more

It looks like I'm not specifying the file path properly. How is it supposed to be specified?

asked Jan 17 '14 at 20:01 by Johnny Oshika


2 Answers

I stopped using core.properties a couple of versions ago because variable substitution was not working properly, but I can do this in solr.xml:

    <core name="core0" instanceDir="core0" dataDir="c:\temp\data" />

and the index path gets properly picked up.

answered Sep 28 '22 at 22:09 by Persimmonium


This isn't the most elegant solution, but the only way I could get this to work properly was to hard-code the full path into solrconfig.xml like this:

<dataDir>C:/foo/bar/my_new_data_directory/core1</dataDir>
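
A plausible explanation for why the original core.properties value failed while the forward-slash path above works: if Solr reads core.properties with java.util.Properties (an assumption about Solr's loader, though it is the standard way to read .properties files in Java), the backslash is an escape character in that format. `\f` becomes a form feed and the backslashes before `b` and `m` are silently dropped, so Windows would be handed a path containing a control character. A minimal Java sketch of that parsing (the class and method names are made up for illustration):

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class DataDirEscapes {
    // Hypothetical helper: parse one "dataDir=..." line the way
    // java.util.Properties parses a .properties file.
    static String parseDataDir(String line) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(line));
        } catch (IOException e) {
            // Not expected for an in-memory reader.
            throw new UncheckedIOException(e);
        }
        return props.getProperty("dataDir");
    }

    public static void main(String[] args) {
        // In Java source, "\\" is one backslash in the file content.
        String single = parseDataDir("dataDir=C:\\foo\\bar\\my_new_data_directory");
        String doubled = parseDataDir("dataDir=C:\\\\foo\\\\bar\\\\my_new_data_directory");
        String forward = parseDataDir("dataDir=C:/foo/bar/my_new_data_directory");

        // Single backslashes: "\f" turned into a form feed, the others vanished.
        System.out.println(single.contains("\f"));   // true
        System.out.println(single.contains("\\"));   // false

        // Doubled backslashes and forward slashes both survive parsing.
        System.out.println(doubled);  // C:\foo\bar\my_new_data_directory
        System.out.println(forward);  // C:/foo/bar/my_new_data_directory
    }
}
```

If this is indeed the cause, writing the core.properties value with forward slashes, or doubling each backslash, should avoid the "volume label syntax is incorrect" error.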

UPDATE (2014-02-14)

I've realized that I can combine a couple of approaches to get the desired result.

When starting up Solr, I can pass this JVM system property (the name has to match the one referenced in solrconfig.xml):

-Dsolr.data.dir=C:/foo/bar/my_new_data_directory/

Then in solrconfig.xml, I can prefix my data directory with the parameter set during startup:

<dataDir>${solr.data.dir:}core1</dataDir>

This will set the data directory to: C:/foo/bar/my_new_data_directory/core1

With this technique, I can support multiple cores without having to hard-code the full path in solrconfig.xml:

C:/foo/bar/my_new_data_directory/core1
C:/foo/bar/my_new_data_directory/core2
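
To make the `${solr.data.dir:}core1` syntax concrete, here is a rough Java sketch of how a `${name:default}` placeholder can be resolved against a set of properties. It is an illustration only, not Solr's actual substitution code; the class name, method name, and regular expression are my own assumptions, and unlike this sketch, real Solr may report an error for a missing property that has no default.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PlaceholderSketch {
    // Matches ${name} or ${name:default}; group 1 is the name,
    // group 2 is the default (null when no ":" part is present).
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\$\\{([^:}]+)(?::([^}]*))?\\}");

    // Hypothetical resolver: replace each placeholder with the property
    // value, falling back to the default (or "" if there is none).
    static String resolve(String template, Map<String, String> props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String fallback = m.group(2) == null ? "" : m.group(2);
            String value = props.getOrDefault(m.group(1), fallback);
            // quoteReplacement keeps "\" and "$" in paths literal.
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> startup = Collections.singletonMap(
                "solr.data.dir", "C:/foo/bar/my_new_data_directory/");

        // Property set at startup: the prefix is prepended.
        System.out.println(resolve("${solr.data.dir:}core1", startup));
        // C:/foo/bar/my_new_data_directory/core1

        // Property not set: the empty default leaves a relative "core1".
        System.out.println(resolve("${solr.data.dir:}core1",
                Collections.<String, String>emptyMap()));
        // core1
    }
}
```

The second case is worth keeping in mind: because the default after the colon is empty, forgetting the startup property leaves just `core1`, a relative path, rather than failing outright.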

answered Sep 28 '22 at 20:09 by Johnny Oshika