I have a given document:
{
    "foo": {}
}
where foo can have an arbitrary number of properties. Let's assume I will import millions of documents into my index, in which foo has different properties each time.
That means my dynamically built mapping will grow enormous. Is there any way to tell Elasticsearch something like "take everything you have in foo and just accept it as it is (or stringify foo)", without ending up with a million-line mapping?
Or do I have to take care of this myself, before indexing the documents?
If so, I can think of two solutions:
1. JSON.stringify foo
2. Map every property in foo into key/value pairs, and create an array of objects:
// object
{
    "foo": [
        {"key": "bar1", "value": "bar1's value"},
        {"key": "bar2", "value": "bar2's value"}
    ]
}
// resulting mapping
{
    "type": {
        "properties": {
            "foo": {
                "properties": {
                    "key": {
                        "type": "string"
                    },
                    "value": {
                        "type": "string"
                    }
                }
            }
        }
    }
}
Would you prefer solution 1 or 2, and why?
I appreciate your help!
The default value of index.mapping.total_fields.limit, the maximum number of fields in an index, is 1000. The limit is in place to prevent mappings and searches from becoming too large. Higher values can lead to performance degradation and memory issues, especially in clusters with a high load or few resources.
Mapping is the process of defining how a document, and the fields it contains, are stored and indexed. Each document is a collection of fields, which each have their own data type. When mapping your data, you create a mapping definition, which contains a list of fields that are pertinent to the document.
From Elasticsearch version 6.0 onward, an index does not allow multiple types by default. The better option is to always have one document type per index, so that each index is responsible for a single document type and mapping.
These can be accessed from your dashboard by choosing Stack Settings > Elasticsearch. The next step is to write a curl -X GET command to retrieve the mappings from Elasticsearch. You will get JSON output in return.
You cannot make Elasticsearch stringify it for you. You can make it ignore whatever's under "foo" and it'll be a part of "_source", but then it will not be searchable at all.
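One way to get this "ignore it" behaviour is the "enabled" parameter on an object field. Below is a minimal sketch of such a mapping body, built as a plain Python dict. The type name "type" simply mirrors the question's mapping, and the exact wrapper around "properties" depends on your Elasticsearch version, so treat the overall shape as an assumption rather than a definitive request body.

```python
import json

# Mapping body that tells Elasticsearch not to parse or index anything under "foo".
# "type" is the placeholder type name from the question's mapping (an assumption).
mapping = {
    "type": {
        "properties": {
            "foo": {
                "type": "object",
                # foo stays in _source, but none of its properties
                # are added to the mapping or made searchable.
                "enabled": False,
            }
        }
    }
}

# Print the JSON you would send when creating the mapping.
print(json.dumps(mapping, indent=2))
```

With this in place the mapping stays the same size no matter how many different properties show up under foo, at the cost of not being able to search on them.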
The second approach can make a lot of sense, depending on how you're going to query it, and what you can know about the kinds of values you will accept.
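Both client-side transformations from the question can be sketched in a few lines before indexing. This is only an illustration: the helper names stringify_foo and to_key_value_pairs are hypothetical, and coercing every value to a string is an assumption made to match the string-typed "value" field in the question's mapping.

```python
import json


def stringify_foo(doc):
    """Solution 1: store foo as a single JSON string."""
    out = dict(doc)
    out["foo"] = json.dumps(doc.get("foo", {}), sort_keys=True)
    return out


def to_key_value_pairs(doc):
    """Solution 2: turn each property of foo into a {"key", "value"} object."""
    out = dict(doc)
    out["foo"] = [
        # Assumption: values are coerced to strings to fit a string-typed field.
        {"key": key, "value": str(value)}
        for key, value in doc.get("foo", {}).items()
    ]
    return out


doc = {"foo": {"bar1": "bar1's value", "bar2": "bar2's value"}}
print(stringify_foo(doc))
print(to_key_value_pairs(doc))
```

The stringified form keeps the mapping to a single field but can only be searched as text, while the key/value form keeps the mapping at two fields and still lets you query by property name and value.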
There is a related question at Dynamic Type with Mappings describing this approach, with a runnable example here: https://www.found.no/play/gist/7596633
The idea is that you'll have a nested document per value. This works well if the number of values per document is not huge. If you don't use nested documents, the key and value fields are flattened across the whole array, so your document would also be returned if you search for "key": "bar1" and "value": "bar2's value", even though those two belong to different pairs.
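The difference between the two behaviours can be illustrated without a cluster. The following is a pure-Python simulation of exact-match semantics only, not Elasticsearch code; the function names are hypothetical and text analysis is ignored.

```python
def flattened_match(pairs, key, value):
    """Plain object array: keys and values are pooled across all pairs."""
    keys = {p["key"] for p in pairs}
    values = {p["value"] for p in pairs}
    return key in keys and value in values


def nested_match(pairs, key, value):
    """Nested documents: key and value must match within the same pair."""
    return any(p["key"] == key and p["value"] == value for p in pairs)


foo = [
    {"key": "bar1", "value": "bar1's value"},
    {"key": "bar2", "value": "bar2's value"},
]

# Key from the first pair, value from the second pair.
print(flattened_match(foo, "bar1", "bar2's value"))  # True: false positive
print(nested_match(foo, "bar1", "bar2's value"))     # False
print(nested_match(foo, "bar1", "bar1's value"))     # True
```

In the mapping, the nested behaviour corresponds to declaring foo with "type": "nested" and querying it with a nested query.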