We have an offline system that consumes input documents from external sources, transforms them, and stores them in Solr, one collection at a time.
There is a production Solr instance with a different configuration from the offline Solr instance (but running the same version of Solr) that the data needs to be moved to once it is ready. This is set to run periodically, and every time there is new incoming data it will replace the documents of a collection with the same name and schema in the production instance.
Is it in any way possible to do this without having to re-index the collection in the production instance? Is there some sort of back-up and restore mechanism that will allow us to copy the data, index and all, into the production system with minimal downtime?
To delete documents from an Apache Solr index, specify the IDs of the documents to be deleted between the <delete></delete> tags. In this example, the XML deletes the documents with IDs 003 and 005. Save it in a file named delete.xml.
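The delete message described above can be sketched in Python. This is only an illustration: the helper names `build_delete_xml` and `write_delete_file` are made up here, and the element layout assumes Solr's standard XML delete-by-ID message (`<delete>` wrapping one `<id>` per document).

```python
import xml.etree.ElementTree as ET


def build_delete_xml(doc_ids):
    """Build a Solr XML delete message with one <id> element per document ID."""
    root = ET.Element("delete")
    for doc_id in doc_ids:
        ET.SubElement(root, "id").text = doc_id
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


def write_delete_file(path, doc_ids):
    """Write the delete message to a file (for example delete.xml)."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(build_delete_xml(doc_ids))
```

Calling `write_delete_file("delete.xml", ["003", "005"])` would produce the file mentioned above, which can then be posted to the collection's update handler.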
There is no process in Solr for programmatically reindexing data. When we say "reindex", we mean, literally, "index it again". However you got the data into the index the first time, you will run that process again.
Solr includes a simple command line tool for POSTing various types of content to a Solr server. The tool is bin/post. The bin/post tool is a Unix shell script; for Windows (non-Cygwin) usage, see the Windows section below.
After you post all your documents, call commit once, manually or from SolrJ. It will take a while to commit, but this will be much faster overall. Also, after you are done with your bulk import, reduce the autoCommit settings maxTime and maxDocs so that any incremental posts you make to Solr are committed much sooner.
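As a rough sketch of the "post everything, then commit once" pattern, the Python below prepares one JSON update request per batch and a single commit request at the end. The function names, the batch size, and the base URL are assumptions for illustration; the requests target the collection's `/update` handler and rely on Solr's JSON `{"commit": {}}` command.

```python
import json
from urllib.request import Request, urlopen


def build_update_requests(base_url, collection, docs, batch_size=1000):
    """Return one POST request per batch of docs, followed by one commit request."""
    update_url = f"{base_url.rstrip('/')}/solr/{collection}/update"
    headers = {"Content-Type": "application/json"}
    requests = []
    for start in range(0, len(docs), batch_size):
        body = json.dumps(docs[start:start + batch_size]).encode("utf-8")
        requests.append(Request(update_url, data=body, headers=headers, method="POST"))
    # A single explicit commit, issued only after every batch has been sent.
    commit_body = json.dumps({"commit": {}}).encode("utf-8")
    requests.append(Request(update_url, data=commit_body, headers=headers, method="POST"))
    return requests


def send_all(requests):
    """Send the prepared requests in order; raises on HTTP errors."""
    for req in requests:
        with urlopen(req) as resp:
            resp.read()
```

Building the requests separately from sending them makes it easy to check the batching before anything touches the server; `send_all(build_update_requests("http://localhost:8983", "your-collection-name", docs))` would then perform the import.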
You can try making a backup on one system and restoring it on the other:
Backup:
http://localhost:8983/solr/your-collection-name/replication?command=backup&location=d:\\solr-backup
Restore:
http://localhost:8983/solr/your-collection-name/replication?command=restore&location=d:\\solr-backup
Change localhost:8983 to your server's name and port (backup on one, restore on the other) and your-collection-name to your core name. d:\\solr-backup is the folder on the server where the backups will be stored (make sure you copy the backup data from one server to the other).
See also the Solr wiki.
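The two URLs above can also be built and called from a script. The sketch below is a minimal example under stated assumptions: the helper names, host names, and backup path are placeholders, and it uses only the `command` and `location` parameters shown in the answer.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def replication_url(base_url, core, command, location):
    """Build a replication-handler URL for the given command ('backup' or 'restore')."""
    query = urlencode({"command": command, "location": location})
    return f"{base_url.rstrip('/')}/solr/{core}/replication?{query}"


def call(url):
    """Issue the request and return the raw response body."""
    with urlopen(url) as resp:
        return resp.read().decode("utf-8")


# Hypothetical hosts and path; adjust to your own servers.
backup = replication_url("http://offline-host:8983", "your-collection-name",
                         "backup", "/var/solr-backup")
restore = replication_url("http://prod-host:8983", "your-collection-name",
                          "restore", "/var/solr-backup")
```

The backup folder still has to be copied from the offline server to the production server between `call(backup)` and `call(restore)`. Both commands return before the work is finished, so check that the backup has completed before copying it.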