
What is the correct way of maintaining indices for Sunspot Solr?

I am confused about the Solr indexing mechanism. Perhaps someone can shed some light on it.

We have two rake commands: rake sunspot:solr:index and rake sunspot:solr:reindex.

Here's what my index task looks like (I overrode it for Mongoid):

namespace :sunspot do
  namespace :solr do 
    desc "indexes searchable models"
    task :index => :environment do
      [Model1, Model2].each do |model|
        Sunspot.index!(model.all)
      end
    end
  end
end

As far as I understand, my definition of index is effectively reindexing the collections each time I run it.

Am I right? Does it overwrite the previous index or do I have to use reindex to drop the old and create the new indices?

I am using the gems sunspot v2.0.0, sunspot_mongo v1.0.1, and sunspot_solr v2.0.0.

Yevgeniy asked Apr 12 '13 18:04


1 Answer

The reindex task just calls solr_reindex on each of the models; the source for that method is below (taken from GitHub).

   # Completely rebuild the index for this class. First removes all 
   # instances from the index, then loads records and indexes them.
   #   
   # See #index for information on options, etc.
   #   
   def solr_reindex(options = {}) 
     solr_remove_all_from_index
     solr_index(options)
   end 

If you look at the source of solr_index, the comments say that the method will "Add/update all existing records in the Solr index."

So to answer your question: the reindex task is essentially the same as the index task, except that the reindex task first drops the existing index. Your overridden index task therefore adds or updates documents but never removes any. If you have items in your index that you know shouldn't be there, you should call reindex. If you know that you are only adding or updating items in the index, you can just call index and avoid the performance hit of dropping the index and rebuilding it from scratch.
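To make the difference concrete, here is a minimal sketch in plain Ruby. ToyIndex and its methods are hypothetical stand-ins written for this illustration; they are not part of Sunspot or Solr. The sketch assumes a record was deleted from the database without the index being told, which is the situation where stale documents appear.

```ruby
# Toy in-memory stand-in for a search index (hypothetical; NOT Sunspot's API).
class ToyIndex
  def initialize
    @docs = {}
  end

  # Add or update documents keyed by id, mirroring "add/update" semantics.
  def index(records)
    records.each { |record| @docs[record[:id]] = record }
  end

  # Remove every document from the index.
  def remove_all
    @docs.clear
  end

  # Drop everything, then index again: the same shape as solr_reindex.
  def reindex(records)
    remove_all
    index(records)
  end

  # Sorted ids of the documents currently in the index.
  def ids
    @docs.keys.sort
  end
end

# Pretend database contents.
db = [{ id: 1, title: "a" }, { id: 2, title: "b" }]

idx = ToyIndex.new
idx.index(db)

# Record 2 is deleted and record 3 is created, without touching the index.
db.delete_if { |record| record[:id] == 2 }
db << { id: 3, title: "c" }

idx.index(db)
after_index = idx.ids    # id 2 is still present as a stale document

idx.reindex(db)
after_reindex = idx.ids  # only ids that exist in db remain
```

After the plain index call the stale id 2 is still in the index; after reindex only the ids currently in db are left. That is the trade-off described above: index is cheaper, reindex guarantees a clean slate.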

eremzeit answered Oct 13 '22 01:10