I use Elasticsearch to store data sent from multiple sources outside of my system, i.e. I don't control the incoming data - I just receive a JSON document and store it. I have no Logstash with its filters in the middle, only ES and Kibana. Each data source sends its own data type, and all of them are stored in the same index (per tenant) but under different types. However, since I cannot control the data that is sent to me, it is possible to receive documents of different types containing a field with the same name but a different structure.
For example, assume that I have type1 and type2 with a field FLD, which is an object in both cases, but the structure of this object is not the same. Specifically, FLD.name is a string field in type1 but an object in type2. In this case, when type1 data arrives it is stored successfully, but when type2 data arrives it is rejected:
failed to put mappings on indices [[myindex]], type [type2]
java.lang.IllegalArgumentException: Mapper for [FLD] conflicts with existing mapping in other types[Can't merge a non object mapping [FLD.name] with an object mapping [FLD.name]]
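To make the conflict concrete, the two kinds of payloads look roughly like this (values are made up for illustration):

A type1 document, where FLD.name is a plain string:

{ "FLD": { "name": "some plain string" } }

A type2 document, where FLD.name is itself an object:

{ "FLD": { "name": { "first": "some", "last": "value" } } }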
The ES documentation clearly states that fields with the same name, in the same index, in different mapping types are mapped to the same field internally and must have the same mapping (see here).
My question is: what can I do in this case? I'd prefer to keep all the types in the same index. Is it possible to add a unique-per-type suffix to field names, or something like this? Any other solution? I'm a newbie in Elasticsearch, so maybe I'm missing something simple... Thanks in advance.
It is recommended to store one mapping type per index. Starting with Elasticsearch version 6.0, an index no longer allows multiple mapping types by default, so the better option is to always have one document type per index: the single responsibility of an index is to hold a single document type / mapping type.
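For example, the conflicting types from the question could be split into two indices, each with its own mapping for FLD. This is only a sketch (Kibana Dev Tools syntax, typeless mappings as in Elasticsearch 7.x; the index names and the "first" sub-field are made up for illustration):

# FLD.name as a plain string field
PUT myindex-type1
{
  "mappings": {
    "properties": {
      "FLD": {
        "properties": {
          "name": { "type": "text" }
        }
      }
    }
  }
}

# FLD.name as an object
PUT myindex-type2
{
  "mappings": {
    "properties": {
      "FLD": {
        "properties": {
          "name": {
            "properties": {
              "first": { "type": "text" }
            }
          }
        }
      }
    }
  }
}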
It is not possible to update the mapping of an existing field. If a field has been mapped to the wrong type, re-creating the index with the corrected mapping and re-indexing the data is the only option available. In version 7.0, Elasticsearch deprecated document types altogether, and the default document type is _doc.
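A minimal re-indexing sketch, assuming the new per-type index from above already exists with the corrected mapping; filtering on the _type metadata field works when the source data was written under the old multi-type indices (pre-7.x):

POST _reindex
{
  "source": {
    "index": "myindex",
    "query": { "term": { "_type": "type2" } }
  },
  "dest": { "index": "myindex-type2" }
}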
The current mappings can be checked from your dashboard by choosing Stack Settings > Elasticsearch. The next step is to run a curl -X GET command to retrieve the mappings from Elasticsearch, which returns a JSON description of every field in the index.
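For a cluster reachable on localhost, the request might look like this (host, port and index name are placeholders):

curl -X GET "http://localhost:9200/myindex/_mapping?pretty"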
When Elasticsearch detects a new field in a document, it dynamically adds the field to the type mapping by default. The dynamic parameter controls this behavior. You can explicitly instruct Elasticsearch to dynamically create fields based on incoming documents by setting the dynamic parameter to true or runtime.
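A minimal sketch of setting the dynamic parameter at the top level of a mapping (the index name is illustrative; the runtime value requires Elasticsearch 7.11 or later):

PUT myindex
{
  "mappings": {
    "dynamic": "runtime"
  }
}

With this setting, any unmapped field in an incoming document is added as a runtime field instead of a regular indexed field; with true (the default) it is added to the normal mapping.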
There is no way to index arbitrary JSON without pre-processing it before it's indexed - not even Dynamic templates are flexible enough. You can flatten nested objects into key-value pairs and use a Nested datatype, Multi-fields, and ignore_malformed to index arbitrary JSON (even with type conflicts), as described here. Unfortunately, Elasticsearch can still throw an exception at query time if you try to, for example, match a string against kv_pairs.value.long, so you'll have to choose the appropriate field based on the format of the value.
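A sketch of the kind of mapping such key-value flattening implies, assuming documents are pre-processed into a kv_pairs array of {key, value} entries before indexing (the sub-field names other than kv_pairs.value.long are assumptions):

PUT myindex
{
  "mappings": {
    "properties": {
      "kv_pairs": {
        "type": "nested",
        "properties": {
          "key": { "type": "keyword" },
          "value": {
            "type": "keyword",
            "fields": {
              "long":   { "type": "long",   "ignore_malformed": true },
              "double": { "type": "double", "ignore_malformed": true },
              "date":   { "type": "date",   "ignore_malformed": true }
            }
          }
        }
      }
    }
  }
}

At query time you then pick kv_pairs.value or one of its typed sub-fields depending on the format of the stored value, which is exactly the caveat mentioned above.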