I am wondering why Elasticsearch uses that header when the request body is not a single JSON document but multi-line text in which each line is a JSON object. For example:
{ "create" : { "_index" : "movies", "_type" : "movie", "_id" : "135569" } }
{ "id": "135569", "title" : "Star Trek Beyond", "year":2016 , "genre":["Action", "Adventure", "Sci-Fi"] }
{ "create" : { "_index" : "movies", "_type" : "movie", "_id" : "122886" } }
{ "id": "122886", "title" : "Star Wars: Episode VII - The Force Awakens", "year":2015 , "genre":["Action", "Adventure", "Fantasy", "Sci-Fi", "IMAX"] }
{ "create" : { "_index" : "movies", "_type" : "movie", "_id" : "109487" } }
{ "id": "109487", "title" : "Interstellar", "year":2014 , "genre":["Sci-Fi", "IMAX"] }
{ "create" : { "_index" : "movies", "_type" : "movie", "_id" : "58559" } }
{ "id": "58559", "title" : "Dark Knight, The", "year":2008 , "genre":["Action", "Crime", "Drama", "IMAX"] }
{ "create" : { "_index" : "movies", "_type" : "movie", "_id" : "1924" } }
{ "id": "1924", "title" : "Plan 9 from Outer Space", "year":1959 , "genre":["Horror", "Sci-Fi"] }
This would be a valid request despite not being a well-formed JSON document. Is it common in RESTful interfaces to declare a body as application/json even when it is not? You can't even send it from Postman, only from cURL, which does not validate the body syntax.
Technically, when calling the _bulk endpoint, the Content-Type header should be application/x-ndjson and not application/json, as stated in their docs:
the final line of data must end with a newline character \n. Each newline character may be preceded by a carriage return \r. When sending requests to this endpoint the Content-Type header should be set to application/x-ndjson.
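As a rough illustration of those two rules, here is a minimal Python sketch that serializes documents into a bulk body. The helper name build_bulk_body and its parameters are made up for this example, the action lines simply mirror the ones in the question, and nothing is actually sent over the network.

```python
import json

def build_bulk_body(docs, index="movies", doc_type="movie"):
    """Serialize docs as NDJSON: one action line followed by one source line per doc."""
    lines = []
    for doc in docs:
        # Action metadata copied from the question's example (hypothetical index and type).
        action = {"create": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        # json.dumps without indent never emits a raw newline, so each value stays on one line.
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    # The final line must be terminated by a newline as well.
    return "\n".join(lines) + "\n"

docs = [
    {"id": "135569", "title": "Star Trek Beyond", "year": 2016,
     "genre": ["Action", "Adventure", "Sci-Fi"]},
    {"id": "109487", "title": "Interstellar", "year": 2014,
     "genre": ["Sci-Fi", "IMAX"]},
]

# Header to send along with the body when posting to the _bulk endpoint.
headers = {"Content-Type": "application/x-ndjson"}
body = build_bulk_body(docs)
```

The resulting body and headers can be handed to whatever HTTP client you prefer. Building the string by joining lines and appending one last "\n" makes it hard to forget the trailing newline.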
The body is not a JSON array because, when the coordinating node receives the bulk request, it can split it into several chunks simply by counting lines (i.e. newline characters) and send each chunk to a different node for processing. If the content were a single JSON document, the coordinating node would have to parse all of it first, which for bulk requests of several megabytes would hurt performance.
NDJSON is a convenient format for storing or streaming structured data that may be processed one record at a time.
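To make the idea concrete, here is a small Python sketch of both properties: cutting a payload on newline boundaries without parsing any JSON, and reading it one record at a time. It only illustrates the format and is not how Elasticsearch implements this internally; chunk_lines and iter_records are hypothetical names, and the sample payload is shortened from the question.

```python
import json

def chunk_lines(payload, lines_per_chunk):
    """Split an NDJSON payload on newline boundaries without parsing any JSON."""
    lines = payload.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final newline
    return [
        "\n".join(lines[i:i + lines_per_chunk]) + "\n"
        for i in range(0, len(lines), lines_per_chunk)
    ]

def iter_records(payload):
    """Yield one parsed JSON value per non-empty line."""
    for line in payload.split("\n"):
        line = line.rstrip("\r")  # a newline may be preceded by a carriage return
        if line:
            yield json.loads(line)

payload = (
    '{"create": {"_index": "movies", "_id": "1924"}}\n'
    '{"id": "1924", "title": "Plan 9 from Outer Space"}\n'
    '{"create": {"_index": "movies", "_id": "58559"}}\n'
    '{"id": "58559", "title": "Dark Knight, The"}\n'
)

chunks = chunk_lines(payload, 2)       # each chunk holds one action/source pair
records = list(iter_records(payload))  # parsed lazily, one line at a time
```

Splitting on "\n" rather than using str.splitlines is deliberate: splitlines also breaks on characters such as U+2028, which may legally appear unescaped inside a JSON string.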