Let's say I have movie data in my ElasticSearch and I created them like this:
curl -XPUT "http://192.168.0.2:9200/movies/movie/1" -d'
{
"title": "The Godfather",
"director": "Francis Ford Coppola",
"year": 1972
}'
And I have a bunch of movies from different years. I want to copy all the movies from a particular year (so, 1972) and copy them to a new index of "70sMovies", but I couldn't see how to do that.
Basically you would take a snapshot of your existing index, restore it into a new index and then use the Delete command to delete all documents with a year other than 1972. The snapshot and restore module allows to create snapshots of individual indices or an entire cluster into a remote repository.
Indices can only be cloned if they meet the following requirements: The target index must not exist. The source index must have the same number of primary shards as the target index. The node handling the clone process must have sufficient free disk space to accommodate a second copy of the existing index.
Here are three popular methods, you use to export files from Elasticsearch to any desired warehouse or platform of your choice: Elasticsearch Export: Using Logstash-Input-Elasticsearch Plugin. Elasticsearch Export: Using Elasticsearch Dump. Elasticsearch Export: Using Python Pandas.
If the Elasticsearch security features are enabled, you must have the following index privileges for the target data stream, index, or index alias: To add or overwrite a document using the PUT /<target>/_doc/<_id> request format, you must have the create , index , or write index privilege.
Since ElasticSearch 2.3 you can now use the built in _reindex
API
for example:
POST /_reindex
{
"source": {
"index": "twitter"
},
"dest": {
"index": "new_twitter"
}
}
Or only a specific part by adding a filter/query
POST /_reindex
{
"source": {
"index": "twitter",
"query": {
"term": {
"user": "kimchy"
}
}
},
"dest": {
"index": "new_twitter"
}
}
Read more: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html
The best approach would be to use elasticsearch-dump tool https://github.com/taskrabbit/elasticsearch-dump.
The real world example I used :
elasticdump \
--input=http://localhost:9700/.kibana \
--output=http://localhost:9700/.kibana_read_only \
--type=mapping
elasticdump \
--input=http://localhost:9700/.kibana \
--output=http://localhost:9700/.kibana_read_only \
--type=data
Check out knapsack: https://github.com/jprante/elasticsearch-knapsack
Once you have the plugin installed and working, you could export part of your index via query. For example:
curl -XPOST 'localhost:9200/test/test/_export' -d '{
"query" : {
"match" : {
"myfield" : "myvalue"
}
},
"fields" : [ "_parent", "_source" ]
}'
This will create a tarball with only your query results, which you can then import into another index.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With