I have two Jackrabbit instances containing the same content. Rebuilding the Lucene index is slow (30+ hours), and the downtime needed in the cluster is risky. Is it possible to re-index just one Jackrabbit instance and then copy the Lucene index from it to the other?
Naively copying the Lucene index files beneath the workspace directory doesn't work. The issue appears to be that content is indexed by document number, which maps to a UUID, which in turn maps to the JCR path of the indexed node, and these UUIDs are not stable for a given path across Jackrabbit instances. (Both are actually Day CQ publisher instances populated by replication from a CQ author instance.)
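To make the instability concrete: Jackrabbit mints a fresh random UUID as each node's identifier when the node is created, independently on each instance, so two publishers fed the same content by replication end up with different identifiers for the same path. A minimal plain-Java sketch of that behaviour (no Jackrabbit API; the path and `assignId` helper are hypothetical, purely for illustration):

```java
import java.util.UUID;

public class UuidInstability {
    // Hypothetical stand-in for what each publisher does on node creation:
    // mint a fresh random UUID, independent of the path and of other instances.
    static UUID assignId(String path) {
        return UUID.randomUUID();
    }

    public static void main(String[] args) {
        String path = "/content/example/page"; // hypothetical node path
        UUID onPublisherA = assignId(path);
        UUID onPublisherB = assignId(path);
        System.out.println(path + " on A -> " + onPublisherA);
        System.out.println(path + " on B -> " + onPublisherB);
        // A Lucene index keyed on A's UUIDs is therefore meaningless on B.
        System.out.println("identifiers match: " + onPublisherA.equals(onPublisherB));
    }
}
```

This is why copying just the index files fails: the document-number-to-UUID entries baked into the copied index point at identifiers the target instance has never issued.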
I've managed to find a UUID-to-path mapping in the repository under /jcr:system/jcr:versionStorage/, but I can't see an easy way to copy this between repositories along with the Lucene index. I also can't find the UUID-to-document-number mapping anywhere in the files; is that part of the Lucene index too?
Thanks for any help. I'm leaning towards just re-indexing the second instance separately and accepting the downtime, but any ideas for reducing the risk or the elapsed time of re-indexing the cluster would be appreciated!
In the end we're going the re-index-them-both route: we've managed to repurpose a test instance as an extra live instance that we can drop into the farm temporarily whilst we take the other two out in turn to re-index. However I'd still be interested in hearing better ways to do this!
That seems like a scary idea, honestly. I'm not sure there is any way to guarantee that you've got the same underlying data, even with identical content and hardware configuration.
If your performance numbers look like ours, the time to copy the entire repository is less than the time it takes to reindex. Have you considered just reindexing one repository, doing a backup/copy, and then configuring the backup/copy to be your second instance?
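The reindex-one-then-clone approach sketched above might look something like this. This is a hedged outline, not a tested procedure: the repository paths and the `crx.default` workspace name are placeholders for whatever your CQ publisher actually uses, and the whole-repository copy must happen while the instance is stopped.

```shell
# Placeholder for this instance's repository directory
REPO_HOME=/path/to/crx-quickstart/repository

# 1. Stop the instance, then delete the workspace search index so
#    Jackrabbit rebuilds it from scratch on the next start.
rm -rf "$REPO_HOME/workspaces/crx.default/index"

# 2. Start the instance and wait for reindexing to complete (the slow part).

# 3. Stop it again and copy the ENTIRE repository -- persistence, data
#    store, and the freshly built index together -- so the UUIDs and the
#    index that references them stay consistent on the second instance.
rsync -a "$REPO_HOME/" second-host:/path/to/crx-quickstart/repository/
```

Copying the whole repository sidesteps the UUID mismatch entirely, because the clone carries the same identifiers the index was built against.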