How to really reindex data in elasticsearch

Tags: elasticsearch, reindex

I have added new mappings (mainly not_analyzed versions of existing fields), and I now have to figure out how to reindex the existing data. I tried following the guide on the Elasticsearch website, but found it too confusing. I also tried plugins (elasticsearch-reindex, allegro/elasticsearch-reindex-tool), and I have looked at "ElasticSearch - Reindexing your data with zero downtime", which is a similar question. I was hoping not to rely on external tools, if possible, and to use the bulk API instead (as with the original insert).

I could easily rebuild the whole index, since the data is effectively read-only, but that won't work in the long term if I want to add more fields once I'm in production. Does anyone know of an easy-to-follow solution or set of steps for a relative novice to ES? I'm on version 2 and using Windows.

asked Nov 22 '15 18:11

metase

1 Answer

Re-indexing means reading the data, deleting it from Elasticsearch and ingesting it again. There is no such thing as "changing the mapping of existing data in place." All the re-indexing tools you mentioned are just wrappers around read->delete->ingest.
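
To make the read->ingest part concrete, here is a minimal sketch in Python of turning documents read from the old index into a request body for the bulk API. It only builds the newline-delimited JSON; sending it (for example with a POST to /_bulk) is left out. The sample hits, the type name "doc" and the function name are made up for illustration; the index names are the ones used further down in this answer.

```python
import json

def hits_to_bulk_body(hits, target_index):
    """Turn search/scroll hits into an NDJSON body for the _bulk endpoint."""
    lines = []
    for hit in hits:
        # Action line: index the document under the same type and id in the new index.
        action = {"index": {"_index": target_index, "_type": hit["_type"], "_id": hit["_id"]}}
        lines.append(json.dumps(action))
        # Source line: the original document, unchanged.
        lines.append(json.dumps(hit["_source"]))
    if not lines:
        return ""
    # The bulk API expects every line, including the last, to end with a newline.
    return "\n".join(lines) + "\n"

# Hypothetical hits, shaped like the "hits.hits" array of a search response.
sample_hits = [
    {"_index": "data-version1", "_type": "doc", "_id": "1", "_source": {"title": "first"}},
    {"_index": "data-version1", "_type": "doc", "_id": "2", "_source": {"title": "second"}},
]

body = hits_to_bulk_body(sample_hits, "data-version2")
print(body, end="")
```

In practice you would read the old index page by page (for instance with the scroll API) and send one bulk request per page rather than building one huge body.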
You can always adjust the mapping for new indices and add fields later. All new fields will be indexed according to this mapping. Or use dynamic mapping if you are not in control of the new fields.
Have a look at "Change default mapping of string to "not analyzed" in Elasticsearch" to see how to use dynamic mapping to get not_analyzed string fields.
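
As a rough illustration of the dynamic mapping idea, the sketch below builds the settings body for a new index in which every newly seen string field is mapped as not_analyzed. It assumes the version 2 mapping syntax mentioned in the question (the "string" type with "index": "not_analyzed"); the template name "strings_not_analyzed" and the function name are arbitrary choices, not something from the linked question.

```python
import json

def not_analyzed_strings_mapping():
    """Index-creation body: map all dynamically added string fields as not_analyzed."""
    return {
        "mappings": {
            # "_default_" applies the template to every type in the index.
            "_default_": {
                "dynamic_templates": [
                    {
                        "strings_not_analyzed": {  # arbitrary template name
                            "match_mapping_type": "string",
                            "mapping": {"type": "string", "index": "not_analyzed"},
                        }
                    }
                ]
            }
        }
    }

mapping_body = not_analyzed_strings_mapping()
print(json.dumps(mapping_body, indent=2))
```

The printed JSON would be used as the body when creating the new index (a PUT to the index name), before any documents are put in.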

Re-indexing is very expensive. A better way is to create a new index and drop the old one. To achieve this with zero downtime, have all your clients use an index alias. Think of an index called "data-version1". In steps:

1. Create your index "data-version1" and give it an alias named "data".
2. Only use the alias "data" in all your client applications.
3. To update your mapping: create a new index (with the new mapping) called "data-version2" and put all your data in it.
4. To switch from version1 to version2: drop the alias "data" on version1 and create an alias "data" on version2 (or first create, then drop). In the time between those two steps your clients will see no data (or double data), but that time should be so short that your clients won't notice it.
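
The switch in the last step can be sketched as a single request body for the aliases endpoint (POST /_aliases), which takes a list of "remove" and "add" actions. As far as I know, the actions in one such request are applied together, which would also avoid the short gap described above. The Python below only builds that body; the function name is made up for illustration.

```python
import json

def alias_swap_actions(alias, old_index, new_index):
    """Body for POST /_aliases that moves an alias from old_index to new_index."""
    return {
        "actions": [
            # Stop pointing the alias at the old index...
            {"remove": {"index": old_index, "alias": alias}},
            # ...and point it at the new one in the same request.
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

swap = alias_swap_actions("data", "data-version1", "data-version2")
print(json.dumps(swap, indent=2))
```

Once the alias points at "data-version2" and clients look fine, "data-version1" can be dropped.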

It's good practice to always use aliases.

answered Oct 05 '22 23:10

dtrv

