I have to insert a JSON array into Elasticsearch. The accepted answer in the linked question suggests inserting a header line before each JSON entry. That answer is two years old; is there a better solution available now, or do I need to edit my JSON file manually?
Is there any way to import a JSON file (containing 100 documents) into an Elasticsearch server?
[
  {
    "id": 9,
    "status": "This is cool."
  },
  ...
]
You need to use the Elasticsearch Bulk API. It allows you to insert multiple items with one request. Requests are POSTed to the special /_bulk endpoint; an example is shown below.
The examples work for Elasticsearch versions 1.x and 2.x, and probably later ones too. For these examples, let's assume you have an index called "myIndex" and a type called "person" with name and age attributes. Don't forget the extra newline after the last document! Note that the URL used contains both the index and the type.
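A minimal sketch of such a request, assuming the "myIndex"/"person" setup above (the two documents and the file name bulk_body.json are placeholders): put the following newline-delimited JSON into bulk_body.json, with one action line before each document and a trailing newline at the end:

{"index":{}}
{"name":"John Doe","age":25}
{"index":{}}
{"name":"Jane Doe","age":32}

Then POST it to the endpoint, whose URL contains both the index and the type (on newer Elasticsearch versions you may also need -H 'Content-Type: application/x-ndjson'):

curl -s -XPOST 'localhost:9200/myIndex/person/_bulk' --data-binary @bulk_body.json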
OK, then there's something pretty simple you can do using a simple shell script (see below). The idea is to not have to edit your file manually, but to let Python do it and create another file whose format complies with what the _bulk endpoint expects. It does the following:

1. A small Python snippet reads your JSON file and writes a new file in the format expected by the _bulk endpoint.
2. That new file is then sent to the _bulk endpoint using a simple curl command.

Save the following script as bulk.sh:
#!/bin/sh

# 0. Some constants to re-define to match your environment
ES_HOST=localhost:9200
JSON_FILE_IN=/path/to/your/file.json
JSON_FILE_OUT=/path/to/your/bulk.json

# 1. Python code to transform your JSON file
PYTHON="import json,sys;
out = open('$JSON_FILE_OUT', 'w');
with open('$JSON_FILE_IN') as json_in:
    docs = json.loads(json_in.read());
    for doc in docs:
        out.write('%s\n' % json.dumps({'index': {}}));
        out.write('%s\n' % json.dumps(doc, indent=0).replace('\n', ''));
"

# 2. run the Python script from step 1
python -c "$PYTHON"

# 3. use the output file from step 2 in the curl command
curl -s -XPOST $ES_HOST/index/type/_bulk --data-binary @$JSON_FILE_OUT
You need to:

1. Save that script in a file called bulk.sh
2. Make it executable, i.e. chmod u+x bulk.sh
3. Run it: ./bulk.sh
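To illustrate, for the sample document shown in the question the generated bulk.json would contain roughly the following, i.e. one action line followed by the document on a single line, repeated for each entry in the array:

{"index": {}}
{"id": 9,"status": "This is cool."}

When this file is POSTed to the _bulk endpoint, each pair tells Elasticsearch to index the document that follows the action line.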