In our Sitecore project (6.6.0 rev. 130404), we have more than 2 million Sitecore items in total. We have several Lucene indexes configured, each covering a subset of these items. The issue we face is the time it takes to rebuild these indexes from scratch. Sitecore's QuickSearch index in particular can take nearly a full day to rebuild, and that is in addition to our custom indexes.
What are the usual practices for maintaining large Sitecore indexes in day-to-day operations? How often do you need to rebuild indexes? And when you do, how do you cope with the long website downtime (rebuilding takes the index offline)?
If you have multiple servers, you can take one of them out of the load balancer (or stop it from delivering content in some other way) and rebuild the index on that server. Once the rebuild is done, put it back in the load balancer.
You can also try to use Sitecore Lucene Refresher.
Take a look at how to maintain sitecore lucene indexes in huge content delivery webfarm for more options.
One approach I can think of is to break your indexes down by section, page type, or content area of your site, depending on what kind of data and structure you have and where a split makes sense. You would end up with anywhere from 2 to 20, 30, 40 or more indexes, each configured with its own <root> tag in ADBC configuration. That way you already know which part of the site you updated, and you can rebuild only that index if needed.
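As a rough sketch of that split, two indexes scoped to different content roots might look like the fragment below. The index ids (news_index, products_index), folder names, and root paths are hypothetical, and the element names follow the Sitecore.Search index configuration as it ships in the 6.x web.config, so check them against the existing index definitions in your own installation before relying on them.

```xml
<!-- Sketch only: ids, folders and Root paths are made up for illustration. -->
<indexes hint="list:AddIndex">
  <!-- Index covering only the news section -->
  <index id="news_index" type="Sitecore.Search.Index, Sitecore.Kernel">
    <param desc="name">$(id)</param>
    <param desc="folder">news_index</param>
    <Analyzer ref="search/analyzer" />
    <locations hint="list:AddCrawler">
      <web type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
        <Database>web</Database>
        <Root>/sitecore/content/Home/News</Root>
      </web>
    </locations>
  </index>
  <!-- Index covering only the product catalogue -->
  <index id="products_index" type="Sitecore.Search.Index, Sitecore.Kernel">
    <param desc="name">$(id)</param>
    <param desc="folder">products_index</param>
    <Analyzer ref="search/analyzer" />
    <locations hint="list:AddCrawler">
      <web type="Sitecore.Search.Crawlers.DatabaseCrawler, Sitecore.Kernel">
        <Database>web</Database>
        <Root>/sitecore/content/Home/Products</Root>
      </web>
    </locations>
  </index>
</indexes>
```

With a layout like this, a change under the news section would only ever require rebuilding news_index, which should be much smaller than one index covering the whole tree.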
In general, you don't need to rebuild all the indexes all the time. As far as I know, publishing updates the index for the changed items; it does not rebuild the whole thing from scratch.
The only situations I can think of where you would rebuild the indexes completely are when you install the site in a new environment, when the indexes were deleted from the server for some reason, or similar cases.
If you upgrade to Sitecore 7.2, you can use the SwitchOnRebuildLuceneIndex index type. It keeps a working index available while the rebuild runs.
<index id="your_index" type="Sitecore.ContentSearch.LuceneProvider.SwitchOnRebuildLuceneIndex, Sitecore.ContentSearch.LuceneProvider">
...
</index>
See John West's post for details: http://www.sitecore.net/deutsch/Community/Technical-Blogs/John-West-Sitecore-Blog/Posts/2013/05/Sitecore-7-Rebuild-Lucene-Indexes-in-Temporary-Subdirectories.aspx