
Solr indexes are not visible

Tags: solr

We're using a training server to create Solr indexes, which we then upload to another (Solr) server via rsync.

Until now, everything has been fine. Recently the index size on one core increased drastically, and our Solr instances now refuse to read the indexes on that core. They ignore those indexes silently, without throwing any exceptions. (We do reload the cores or restart Tomcat after each rsync.)

For example, numDocs in the Solr stats is 0, and /select?q=*:* returns no results.
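The same symptom can be checked from a script instead of the stats page. This is only a minimal sketch: the core URL passed to `count_docs` (for example `http://localhost:8080/solr/core1`) is a hypothetical placeholder, and it assumes the core answers `/select` with the JSON response writer (`wt=json`).

```python
import json
import urllib.parse
import urllib.request


def parse_num_found(body: str) -> int:
    """Extract response.numFound from a Solr JSON select response."""
    return int(json.loads(body)["response"]["numFound"])


def count_docs(core_url: str, timeout: float = 10.0) -> int:
    """Run q=*:* with rows=0 against core_url and return the hit count.

    core_url is an assumed value such as http://localhost:8080/solr/core1.
    """
    query = urllib.parse.urlencode({"q": "*:*", "rows": 0, "wt": "json"})
    url = f"{core_url}/select?{query}"
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_num_found(resp.read().decode("utf-8"))
```

A result of 0 right after a reload, on a core whose index directory is clearly not empty, reproduces the problem described here.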

In case the indexes were corrupted, we regenerated them a couple of times, but nothing changed. When we try smaller indexes, they are read fine.

Our solrconfig.xml for this core: https://gist.github.com/983ebb13c895c9cccbfb

xarion asked Aug 25 '12 15:08



1 Answer

Copying your index with rsync is a bad idea. Your Solr server may not have finished writing files to disk when you start the copy, and you could end up with a corrupt index. The only safe way to do it is to shut down the master (source index), shut down the slave (destination index), remove the entire contents of the slave's index directory, copy the master's index across, and then restart everything.
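The stop, clear, copy, restart sequence can be sketched roughly as below. It is an illustration under assumptions, not a tested deployment script: `replace_index` and its `stop`/`start` hooks are hypothetical names, both directories are assumed to be reachable from the machine running it (for instance after the new index has been transferred to a staging directory), and the source index is assumed to be no longer written to.

```python
import shutil
from pathlib import Path
from typing import Callable


def replace_index(src: Path, dst: Path,
                  stop: Callable[[], None],
                  start: Callable[[], None]) -> None:
    """Replace the index directory dst with a full copy of src.

    stop and start are hypothetical hooks, e.g. wrappers around whatever
    stops and starts Tomcat on the destination host.
    """
    stop()                      # nothing may read or write dst during the swap
    if dst.exists():
        shutil.rmtree(dst)      # remove the whole old index, not only changed files
    shutil.copytree(src, dst)   # copy the complete new index
    start()                     # reached only if the copy succeeded
```

If the copy raises, `start` is never called, so a half-copied index is not served; whether that trade-off suits a given setup is a judgment call.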

A better approach is the one Peer Allan suggested above: use Solr's built-in replication support. See http://wiki.apache.org/solr/SolrReplication.
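With replication configured, a pull can also be requested on demand over HTTP rather than by copying files. The sketch below assumes the replication handler is registered under `/replication` on the slave core, as in the wiki's examples; the core URL and the helper names `replication_url` and `trigger_fetch` are illustrative only.

```python
import urllib.parse
import urllib.request


def replication_url(core_url: str, command: str, **params: str) -> str:
    """Build a replication handler URL, e.g. command=fetchindex or details."""
    query = urllib.parse.urlencode({"command": command, **params})
    return f"{core_url}/replication?{query}"


def trigger_fetch(slave_core_url: str, timeout: float = 30.0) -> str:
    """Ask the slave core to pull the latest index from its configured master."""
    url = replication_url(slave_core_url, "fetchindex")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `replication_url(core_url, "details")` gives a URL that reports replication status, which helps when checking whether a large index actually arrived.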

Mike Sokolov answered Oct 16 '22 15:10