How to update multiple documents in Solr 4.5.1 with JSON? I tried this but it does not work:
POST /solr/mycore/update/json
:
{
"commit": {},
"add": {
"overwrite": true,
"doc": [{
"thumbnail": "/images/404.png",
"url": "/404.html?1",
"id": "demo:/404.html?1",
"channel": "demo",
"display_name": "One entry",
"description": "One entry is not enough."
}, {
"thumbnail": "/images/404.png",
"url": "/404.html?2",
"id": "demo:/404.html?2",
"channel": "demo",
"display_name": "Another entry",
"description": "Another entry is required."
}
]
}
}
When you index a document to solr, it will overwrite any existing document with the same <uniqueKey/> which is usually the id. So yes, it overwrites existing data. When you want to change a single field of a document you will have to reindex the whole document, as solr does not support updating of a field only.
In Solr, a commit is an action which asks Solr to "commit" those changes to the Lucene index files. By default commit actions result in a "hard commit" of all the Lucene index files to stable storage (disk).
By adding content to an index, we make it searchable by Solr. A Solr index can accept data from many different sources, including XML files, comma-separated value (CSV) files, data extracted from tables in a database, and files in common file formats such as Microsoft Word or PDF.
I understand that (at least) from versions 4.0 and older of solr, this has been fixed. Look at http://wiki.apache.org/solr/UpdateJSON.
In ./exampledocs/books.json there is an example of a json file with multiple documents.
[
{
"id" : "978-0641723445",
"cat" : ["book","hardcover"],
"name" : "The Lightning Thief",
"author" : "Rick Riordan",
"series_t" : "Percy Jackson and the Olympians",
"sequence_i" : 1,
"genre_s" : "fantasy",
"inStock" : true,
"price" : 12.50,
"pages_i" : 384
}
,
{
"id" : "978-1423103349",
"cat" : ["book","paperback"],
"name" : "The Sea of Monsters",
"author" : "Rick Riordan",
"series_t" : "Percy Jackson and the Olympians",
"sequence_i" : 2,
"genre_s" : "fantasy",
"inStock" : true,
"price" : 6.49,
"pages_i" : 304
},
...
]
While @fiskfisk answer is still a valid JSON, it is not easy to be serializable from a data structure. This one is.
Solr expects one "add"-key in the JSON-structure for each document (which might seem weird, if you think about the original meaning of the key in the object), since it maps directly to the XML format when doing the indexing - and this way you can have metadata for each document by itself.
{
"commit": {},
"add": {
"doc": {
"id": "321321",
"name": "barfoo"
}
},
"add": {
"doc": {
"id": "123123",
"name": "Foobar"
}
}
}
.. works. I think allowing an array as the element referenced by "add" would make more sense, but I haven't dug further into the source or know the reasoning behind this.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With