I am searching by only a couple of fields, but I want to be able to store the whole document in ES in order to avoid additional DB (MySQL) queries.
I tried adding index: no, store: no to whole objects/properties in the mapping, but I'm still not sure whether the fields are being indexed and adding unnecessary overhead.
Let's say I've got books and each has an author. I want to search only by book title, but I want to be able to retrieve the whole document.
Is this okay:
mappings:
  properties:
    title:
      type: string
      index: analyzed
    author:
      type: object
      index: no
      store: no
      properties:
        first_name:
          type: string
        last_name:
          type: string
Or should I rather do:
mappings:
  properties:
    title:
      type: string
      index: analyzed
    author:
      type: object
      properties:
        first_name:
          index: no
          store: no
          type: string
        last_name:
          index: no
          store: no
          type: string
Or maybe I am doing it completely wrong?
And what about nested properties that should not be indexed?
Instead of storing information as rows of columnar data, Elasticsearch stores complex data structures that have been serialized as JSON documents. When you have multiple Elasticsearch nodes in a cluster, stored documents are distributed across the cluster and can be accessed immediately from any node.
Elasticsearch stores data as JSON documents. Each document correlates a set of keys (names of fields or properties) with their corresponding values (strings, numbers, Booleans, dates, arrays of values, geolocations, or other types of data).
So yes: you are able to store your data in Elasticsearch and retrieve it too. It's a document store as well.
Elasticsearch indexes are just files on disk, and the operating system effectively caches them in RAM. If you have enough RAM, reads should generally be fast, especially for GET requests.
By default, the _source of the document is stored regardless of which fields you choose to index. The _source is used to return the document in the search results, whereas the indexed fields are used for searching.
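The split between what is searchable and what is returned can be illustrated with a toy model. This is only a conceptual sketch in Python, not how Elasticsearch or Lucene is implemented; the ToyIndex class and its methods are hypothetical names invented for the illustration. The whole document is kept as its "source", while only the fields listed as indexed get entries in the search structure.

```python
from collections import defaultdict


class ToyIndex:
    """Toy model: whole documents are kept; only chosen fields are searchable."""

    def __init__(self, indexed_fields):
        self.indexed_fields = set(indexed_fields)
        self.sources = {}                  # doc id -> original document (plays the role of _source)
        self.inverted = defaultdict(set)   # (field, term) -> ids of documents containing the term

    def index(self, doc_id, doc):
        # The full document is always stored, whatever is indexed.
        self.sources[doc_id] = doc
        # Only top-level string fields listed in indexed_fields become searchable.
        for field in self.indexed_fields:
            value = doc.get(field)
            if isinstance(value, str):
                for term in value.lower().split():
                    self.inverted[(field, term)].add(doc_id)

    def search(self, field, term):
        # Look up matching ids, then return the complete stored documents.
        ids = sorted(self.inverted.get((field, term.lower()), set()))
        return [self.sources[i] for i in ids]


toy = ToyIndex(indexed_fields=["title"])
toy.index(1, {"title": "book one", "author": {"first_name": "jon", "last_name": "doe"}})

hits_by_title = toy.search("title", "book")               # matches, returns the whole document
hits_by_author = toy.search("author.first_name", "jon")   # no entries were built for this field
print(hits_by_title)
print(hits_by_author)
```

In this sketch the search on title returns the complete document, author included, while the search on the author field finds nothing because no search entries were ever created for it.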
You can't set index: no on an object to prevent all of its fields from being indexed, but you can get the same result with dynamic templates, using the path_match property to apply the index: no setting to every field within an object. Here is a simple example.
Create an index with a mapping that includes dynamic templates for the author object and the nested categories object:
POST /shop
{
  "mappings": {
    "book": {
      "dynamic_templates": [
        {
          "author_object_template": {
            "path_match": "author.*",
            "mapping": {
              "index": "no"
            }
          }
        },
        {
          "categories_object_template": {
            "path_match": "categories.*",
            "mapping": {
              "index": "no"
            }
          }
        }
      ],
      "properties": {
        "categories": {
          "type": "nested"
        }
      }
    }
  }
}
Index a document:
POST /shop/book/1
{
  "title": "book one",
  "author": {
    "first_name": "jon",
    "last_name": "doe"
  },
  "categories": [
    {
      "cat_id": 1,
      "cat_name": "category one"
    },
    {
      "cat_id": 2,
      "cat_name": "category two"
    }
  ]
}
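To get a rough idea of which fields of this document the two path_match patterns from the mapping would catch, the check can be sketched offline in Python. This is an approximation under stated assumptions: Python's fnmatchcase stands in for Elasticsearch's own wildcard matching (the two should agree for simple patterns that only use *), and the field_paths helper is a hypothetical function written for this illustration, not part of any Elasticsearch client.

```python
from fnmatch import fnmatchcase

# The path_match patterns from the dynamic templates in the mapping above.
NOT_INDEXED_PATTERNS = ["author.*", "categories.*"]


def field_paths(value, prefix=""):
    """Yield dotted paths of leaf fields, descending into objects and arrays of objects."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        # Array elements share the path of the field that holds the array.
        for item in value:
            yield from field_paths(item, prefix)
    else:
        yield prefix


doc = {
    "title": "book one",
    "author": {"first_name": "jon", "last_name": "doe"},
    "categories": [
        {"cat_id": 1, "cat_name": "category one"},
        {"cat_id": 2, "cat_name": "category two"},
    ],
}

paths = sorted(set(field_paths(doc)))
# A field is treated as "not indexed" if any template pattern matches its full dotted path.
not_indexed = [p for p in paths if any(fnmatchcase(p, pat) for pat in NOT_INDEXED_PATTERNS)]
searchable = [p for p in paths if p not in not_indexed]

print("not indexed:", not_indexed)
print("searchable:", searchable)
```

Under these assumptions, only title is left searchable; every field under author and categories matches one of the patterns.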
If you search on the title field with the search term book, the document will be returned. If you search on author.first_name or author.last_name, there won't be a match because these fields were not indexed:
POST /shop/book/_search
{
  "query": {
    "match": {
      "author.first_name": "jon"
    }
  }
}
The same would be the case for a nested query on the category fields:
POST /shop/book/_search
{
  "query": {
    "nested": {
      "path": "categories",
      "query": {
        "match": {
          "categories.cat_name": "category"
        }
      }
    }
  }
}
You can also use the Luke tool to inspect the Lucene index and see which fields have been indexed.