I am using Tire and elasticsearch to provide search functionality on a MongoMapper model, which is part of a Rails App. I just stumbled across a problem where the mappings for this model were not being updated when I redeployed to an environment that uses the following configuration (in config/environments/env_name.rb):
config.cache_classes = true
Reloading the class alone didn't seem to fix the issue (perhaps understandably, since the new mappings might not be compatible with the existing data, I guess?). Instead I had to do the following:
MyModel.index.delete
<restart the app or reload the class>
MyModel.index.import MyModel.all
I just wondered if there's a better way of (a) ensuring the latest mappings defined in my model code are used by Elasticsearch after each deployment, while (b) avoiding unnecessarily repopulating the index with the complete dataset?
We normally deploy using Chef, so I could automate the three steps I used successfully without too much trouble. But I'm new to Elasticsearch and Tire, so I thought it highly likely that I'm misusing both or making things unnecessarily difficult.
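For reference, this is roughly what I'd have Chef call: a small Rake task wrapping the three steps. It's only a sketch based on my current understanding; the task name is made up, and I'm assuming Tire's create_elasticsearch_index can stand in for the restart/reload step, since it re-creates the index from the mapping defined in the model:

# lib/tasks/search.rake
namespace :search do
  desc 'Drop the Elasticsearch index and rebuild it from the current mappings and data'
  task :reindex => :environment do
    MyModel.tire.index.delete                  # step 1: drop the stale index
    MyModel.tire.create_elasticsearch_index    # step 2: re-create it from the mapping in the model
    MyModel.tire.index.import MyModel.all      # step 3: re-populate from MongoDB
  end
end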
If the Elasticsearch security features are enabled, you must have the manage index privilege for the target data stream, index, or alias. Since 7.9, if the request targets an index or index alias, you can also update its mapping with the create, create_doc, index, or write index privilege.
It is not possible to update the mapping of an existing field. If the mapping is set to the wrong type, re-creating the index with an updated mapping and re-indexing is the only option available. In version 7.0, Elasticsearch deprecated the document type; the default document type is now _doc.
Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. Each document is a collection of fields, which each have their own data type. When mapping your data, you create a mapping definition, which contains a list of fields that are pertinent to the document.
The schema in Elasticsearch is a mapping that describes the fields in the JSON documents, along with their data types and how they should be indexed in the Lucene indexes that lie under the hood.
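As a concrete illustration, a mapping declared with Tire in a MongoMapper model might look like the sketch below; the field names are purely illustrative:

class MyModel
  include MongoMapper::Document
  include Tire::Model::Search
  include Tire::Model::Callbacks

  key :title,        String
  key :published_on, Time

  mapping do
    indexes :title,        :type => 'string', :analyzer => 'snowball'
    indexes :published_on, :type => 'date'
  end
end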
Couple of points here:
So, your question is really more about the proper workflow? When you deploy a new version of the application, you shouldn't re-populate the index, in the same way you don't re-populate the database from some kind of backup.
While automatically checking that the index mapping conforms to the current definition in the model is certainly possible (compare MyModel.tire.index.mapping with MyModel.tire.mapping, re-populate if different, etc.), it's something I'd be wary of doing.
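If you did want to go down that road, the check would look roughly like the sketch below. The comparison shown is deliberately naive (the live mapping comes back keyed by document type with string keys, so both sides would need normalising before a meaningful comparison), which is part of why I'd be wary of it:

live_mapping  = MyModel.tire.index.mapping   # the mapping Elasticsearch currently holds
model_mapping = MyModel.tire.mapping         # the mapping defined in the model code

if live_mapping.to_s == model_mapping.to_s
  puts "Mappings match, nothing to do"
else
  puts "Mappings differ -- #{MyModel.tire.index_name} may need re-indexing"
end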
The developer usually knows when she has changed the mapping and should re-index the data. Dropping and re-populating the index also means search downtime, and isn't even feasible for large applications.
A nicer solution is to use a specific index name such as my-index-2012-12 when importing the data, and point a my-index alias to this index. Then you can freely re-populate the index, and flip the alias when you're done, without downtime. Tire tries hard to support you in this kind of workflow (the Rake import task, etc.).
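Sketched out, and assuming made-up index names plus the raw _aliases endpoint for the flip (Tire also ships a Tire::Alias helper), the workflow looks something like this:

require 'tire'
require 'json'

new_index = "my-index-#{Time.now.strftime('%Y-%m')}"
old_index = 'my-index-2012-11'   # whichever index the alias points at right now

# 1. Build and populate the new, dated index; the application keeps
#    searching the 'my-index' alias in the meantime, so there's no downtime.
index = Tire.index(new_index)
index.create :mappings => { MyModel.tire.document_type => { :properties => MyModel.tire.mapping } }
index.import MyModel.all

# 2. Atomically re-point the alias at the freshly built index.
actions = {
  :actions => [
    { :remove => { :index => old_index, :alias => 'my-index' } },
    { :add    => { :index => new_index, :alias => 'my-index' } }
  ]
}.to_json
Tire::Configuration.client.post("#{Tire::Configuration.url}/_aliases", actions)

The bundled import task should be able to replace step 1, e.g. rake environment tire:import CLASS='MyModel' INDEX='my-index-2012-12'.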