I am still trying to figure out what _doc represents in Elasticsearch. From the documentation, the two places I could find its usage are:
In sorting, _doc is recommended because Elasticsearch can efficiently skip to the next matching document when moving to the next page (it simply ignores all docs that have a smaller doc id than the last returned document). Source
Another reference to _doc appears in this git request, which talks about putting a field name against _doc.
Can someone tell me exactly what _doc actually is?
The _doc sort keyword is new in Elasticsearch 2 and replaces the old scan and scroll approach as the efficient way to paginate deep into the results of a query.
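To make the sorting case concrete, here is a minimal sketch of a search body that sorts by _doc, written as a Python dict. The helper name build_doc_sorted_search, the page size, and the idea of sending it to a scroll-enabled _search request on some index are assumptions for illustration, not something taken from the answer above.

```python
import json


def build_doc_sorted_search(query=None, page_size=1000):
    """Build a search body that returns hits in index order (_doc).

    Hypothetical helper: the name and defaults are illustrative only.
    """
    return {
        "size": page_size,
        # Fall back to matching everything when no query is supplied.
        "query": query or {"match_all": {}},
        # Sorting by _doc skips scoring-based ordering entirely.
        "sort": ["_doc"],
    }


body = build_doc_sorted_search()
# The JSON below is what would go in the request body of a _search call.
print(json.dumps(body, indent=2))
```

Because no relevance ordering is requested, this form is only useful when the order of results does not matter, such as exporting or reindexing every matching document.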
In Elasticsearch, an index (plural: indices) contains a schema and can have one or more shards and replicas. An Elasticsearch index is divided into shards and each shard is an instance of a Lucene index. Indices are used to store the documents in dedicated data structures corresponding to the data type of fields.
You can use the search API to search and aggregate data stored in Elasticsearch data streams or indices. The API's query request body parameter accepts queries written in Query DSL. For example, a match query against my-index-000001 can match documents with a user.id value of kimchy.
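The request described here did not survive on this page, so the following is a sketch of what it could look like, built as a path plus a JSON body in Python. The helper name build_match_search is hypothetical; the index name, field, and value are the ones quoted above.

```python
import json


def build_match_search(index, field, value):
    """Return the path and body of a match query (illustrative helper)."""
    path = f"/{index}/_search"
    body = {"query": {"match": {field: value}}}
    return path, body


path, body = build_match_search("my-index-000001", "user.id", "kimchy")
# Print the request in the "METHOD path + JSON body" form used in the docs.
print("GET", path)
print(json.dumps(body, indent=2))
```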
_doc is a mapping type, which by the way is now deprecated.
A mapping type used to be a separate collection inside the same index. E.g. a twitter index could have a mapping of type user for storing all users, and a mapping of type tweet to store all tweets. Both of these types still belong to the same index, so you could search inside multiple types in the same index.
When Elasticsearch announced that mapping types would be deprecated for several reasons, v6 users were forced to use ONLY 1 mapping type per index, i.e. you can have either user or tweet inside the twitter index, but not both. They further recommended being consistent and using _doc as the name of the mapping type. But this can literally be any string - dog, cat, etc. _doc is just the recommended name because in v7 the mapping type field goes away completely. So if every index in Elasticsearch has only 1 mapping type, it is easier to migrate to v7, because you just have to remove the mapping type and all documents then come directly under the index.
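The migration step described above (dropping the single mapping type so that properties sit directly under mappings) can be sketched in Python as follows. The helper strip_mapping_type and the user keyword field are hypothetical, chosen only to show the shape of a typed body versus a typeless one.

```python
import json


def strip_mapping_type(typed_mappings):
    """Turn {"<type>": {"properties": ...}} into {"properties": ...}.

    Illustrative helper: it assumes the index has exactly one mapping type.
    """
    if len(typed_mappings) != 1:
        raise ValueError("expected exactly one mapping type per index")
    # Unpack the only value, whatever the type was called (_doc, dog, cat...).
    (type_body,) = typed_mappings.values()
    return type_body


# Typed (v6-style) index body with a single mapping type named _doc.
v6_body = {"mappings": {"_doc": {"properties": {"user": {"type": "keyword"}}}}}

# Typeless (v7-style) body: the type level is simply removed.
v7_body = {"mappings": strip_mapping_type(v6_body["mappings"])}
print(json.dumps(v7_body, indent=2))
```

The name of the type plays no role in the conversion, which is why any single type name migrates the same way and _doc is only a naming convention.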
From Elasticsearch 8.x, only _doc is supported, and it is just an endpoint name, not a document type.
In 7.0, _doc represents the endpoint name instead of the document type. The _doc component is a permanent part of the path for the document index, get, and delete APIs going forward, and will not be removed in 8.0.
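As a small illustration of _doc as a fixed path component, the sketch below builds the document path used by the index, get, and delete APIs. The helper doc_path and the twitter index with id 1 are assumptions for the example.

```python
def doc_path(index, doc_id):
    """Build the typeless document path: /<index>/_doc/<id>."""
    return f"/{index}/_doc/{doc_id}"


# The same path serves all three document APIs; only the HTTP method differs.
document_requests = [
    (method, doc_path("twitter", "1")) for method in ("PUT", "GET", "DELETE")
]
for method, path in document_requests:
    print(method, path)
```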
Elasticsearch 8.x: Specifying types in requests is no longer supported. The include_type_name parameter is removed.
Schedule For Removal of Mapping Types