I have documents stored in MongoDB like so:
const demoArticle = {
  created: new Date(),
  title: [{
    language: 'english',
    value: 'This is the english title'
  }, {
    language: 'dutch',
    value: 'Dit is de nederlandse titel'
  }]
}
I want to apply language-specific analyzers to these fields, which is normally specified like so:
"mappings": {
  "article": {
    "properties": {
      "created": {
        "type": "date"
      },
      "title.value": {
        "type": "text",
        "analyzer": "english"
      }
    }
  }
}
The problem, however, is that the analyzer should depend on the language set at the child level: each title value should be analyzed with the analyzer that matches its own language.
I came across dynamic templates in Elasticsearch, but I am not convinced they suit this use case.
Any suggestions?
Elasticsearch supports two types of mappings: “Static Mapping” and “Dynamic Mapping.” We use Static Mapping to define the index and data types. However, we still need ongoing flexibility so that documents can store extra attributes.
No, if you want to use a single index, you would need to define a single mapping that combines the fields of each document type. A better way might be to define separate indices on the same cluster for each document type.
We'll support only a finite set of languages (German, English, Korean, Japanese and Chinese) since we need to set up a specific analyzer for each language. Any documents that aren't in one of our supported languages will get indexed in a default field with the standard analyzer.
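As a rough illustration of that fallback rule, here is a minimal sketch in JavaScript. The field names (title_german, title_default and so on) and the helper fieldForLanguage are hypothetical and do not come from the original text; the matching analyzers would still have to be configured in the mapping.

```javascript
// Hypothetical list of supported languages, taken from the paragraph above.
const SUPPORTED_LANGUAGES = ['german', 'english', 'korean', 'japanese', 'chinese'];

// Returns the (hypothetical) field a title should be indexed into.
// Unsupported or missing languages fall back to the default field,
// which is assumed to use the standard analyzer.
function fieldForLanguage(language) {
  const normalized = String(language || '').toLowerCase();
  return SUPPORTED_LANGUAGES.includes(normalized)
    ? `title_${normalized}`
    : 'title_default';
}
```

With this helper, a Dutch title would end up in title_default, while an English one would go to title_english.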
I would go with option 1 (a separate index per language), as suggested by the Elasticsearch documentation, since it avoids term-frequency issues. If your document contains multiple languages, you can put it in multiple indices and use field collapsing at query time to avoid duplicates of the same document being returned.
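To make the index-per-language idea concrete, here is a minimal sketch that splits one article into one entry per language. The index naming scheme (articles-<language>), the articleId field and the helper toPerLanguageDocs are assumptions for illustration only; articleId would need to be mapped as a keyword field for collapsing to work.

```javascript
// Splits one multi-language article into one document per language.
// Each entry names the (hypothetical) target index "articles-<language>".
function toPerLanguageDocs(id, article) {
  return article.title.map(({ language, value }) => ({
    index: `articles-${language}`,
    id,
    body: {
      articleId: id,            // shared key used for collapsing duplicates
      created: article.created,
      title: value,
    },
  }));
}

// Search body sketch: collapse hits on the shared key so an article that
// matches in several language indices is returned only once.
const searchBody = {
  query: { match: { title: 'titel' } },
  collapse: { field: 'articleId' },
};
```

The search itself would then be sent to all language indices at once, for example with an index pattern such as articles-*.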
Mapping is the outline of the documents stored in an index. It defines the data type (such as geo_point or string) and the format of the fields present in the documents, as well as rules to control the mapping of dynamically added fields.
Elasticsearch adds new fields automatically, just by indexing a document. You can add fields to the top-level mapping, and to inner object and nested fields. Use dynamic templates to define custom mappings that are applied to dynamically added fields based on the matching condition.
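For the question above, a dynamic template could be sketched roughly as follows, assuming the title array is first reshaped into an object keyed by language and that every language value is the exact name of an analyzer available on the cluster. The helper titlesByLanguage and the template name title_by_language are hypothetical. The path_match option and the {name} placeholder are documented dynamic template features; as far as I understand, {name} resolves to the name of the new field, so title.dutch would get the dutch analyzer. The body below uses the typeless mapping format of 7.x; on 6.x it would sit under the article mapping type.

```javascript
// Reshape title: [{ language, value }] into title: { <language>: value }.
function titlesByLanguage(titles) {
  return Object.fromEntries(
    titles.map(({ language, value }) => [language, value])
  );
}

// Index-creation body sketch with one dynamic template for title.* fields.
const mappingBody = {
  mappings: {
    dynamic_templates: [
      {
        // Hypothetical template name.
        title_by_language: {
          path_match: 'title.*',
          match_mapping_type: 'string',
          // "{name}" is replaced with the field name, e.g. "dutch".
          mapping: { type: 'text', analyzer: '{name}' },
        },
      },
    ],
    properties: {
      created: { type: 'date' },
    },
  },
};
```

One caveat with this design: a document whose language is not a valid analyzer name would presumably be rejected at index time, so validating the language against a known list first seems sensible.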
These include array, JSON object and nested data types. Indices created in Elasticsearch 7.0.0 or later no longer accept a _default_ mapping. Indices created in 6.x will continue to function as before in Elasticsearch 6.x.
If the language property of your MongoDB objects matches the exact names of the ES language analyzers, then all you need, following the approach recommended by Elastic, is to add:
{
  "mappings": {
    "article": {
      "properties": {
        "created": {
          "type": "date"
        },
        "title": {
          "type": "text",
          "fields": {
            "english": {
              "type": "text",
              "analyzer": "english"
            },
            "dutch": {
              "type": "text",
              "analyzer": "dutch"
            },
            "bulgarian": {
              "type": "text",
              "analyzer": "bulgarian"
            }
          }
        }
      }
    }
  }
}
This way the language value in MongoDB maps directly onto the analyzer name in ES.
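Since the sub-field names and analyzer names are identical, the mapping can also be generated from a list of languages instead of being written by hand. The following is only a sketch: buildTitleMapping is a hypothetical helper, and it assumes every entry in the list is the name of an analyzer that exists on the cluster.

```javascript
// Builds the "title" multi-field mapping: one sub-field per language,
// each using the analyzer with the same name as the language.
function buildTitleMapping(languages) {
  const fields = {};
  for (const language of languages) {
    fields[language] = { type: 'text', analyzer: language };
  }
  return { type: 'text', fields };
}

// Reproduces the mapping shown above for three languages.
const mapping = {
  mappings: {
    article: {
      properties: {
        created: { type: 'date' },
        title: buildTitleMapping(['english', 'dutch', 'bulgarian']),
      },
    },
  },
};
```

A query for a Dutch title would then target the title.dutch sub-field, and so on for the other languages.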