
What does _doc represent in Elasticsearch?

I am still trying to figure out what _doc represents in Elasticsearch. From the documentation, the two places I could find it used are:

  1. In sorting, _doc is the recommended sort order, because Elasticsearch can then efficiently skip to the next matching document when moving to the next page (it will simply ignore all docs that have a smaller doc id than the last returned document). Source

  2. Another reference to _doc is in this git request, which talks about putting a field name against _doc.
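For reference, sorting by _doc as in point 1 is usually written as a plain sort entry in the search body. Below is a minimal sketch that only builds such a body as a Python dict; the helper name `doc_order_search_body` and the default page size are assumptions for illustration, not part of any Elasticsearch client.

```python
def doc_order_search_body(query=None, page_size=1000):
    """Build a search body that asks for hits in index (_doc) order.

    Hypothetical helper: it only assembles the JSON body and sends nothing.
    """
    return {
        "size": page_size,
        # Fall back to matching every document when no query is given.
        "query": query if query is not None else {"match_all": {}},
        # Sorting by _doc skips scoring-based ordering entirely.
        "sort": ["_doc"],
    }


body = doc_order_search_body({"term": {"user": "kimchy"}}, page_size=500)
```

The resulting dict could be passed as the request body of a scroll search; the term query on `user` is just a placeholder.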

Can someone tell me exactly what _doc actually is?

asked Mar 02 '16 12:03 by piyushGoyal



2 Answers

_doc is a mapping type, and mapping types are now deprecated.

A mapping type used to be a separate collection inside the same index. E.g. a twitter index could have a mapping of type user for storing all users, and a mapping of type tweet to store all tweets. Both of these types still belong to the same index, so you could search inside multiple types in the same index.

When Elasticsearch announced that mapping types would be deprecated, for several reasons, v6 users were restricted to ONLY 1 mapping type per index, i.e. you can have either user or tweet inside the twitter index, but not both. They further recommended being consistent and using _doc as the name of the mapping type. It can literally be any string (dog, cat, etc.); _doc is just the recommended name, because mapping types are deprecated in v7 and removed in v8. So if every index in Elasticsearch has only 1 mapping type, migration is easier: you just have to remove the mapping type, and all documents then come directly under the index.
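To make "remove the mapping type" concrete, here is a rough sketch of that migration step applied to a create-index body. It assumes the 6.x body has exactly one type under "mappings" and that the typeless form puts "properties" directly under "mappings"; the helper name `strip_mapping_type` and the `user` field are made up for illustration.

```python
import copy


def strip_mapping_type(create_index_body):
    """Return a copy of a 6.x-style body with its single mapping type removed.

    Hypothetical helper: assumes at most one mapping type per index.
    """
    body = copy.deepcopy(create_index_body)
    mappings = body.get("mappings", {})
    # Already typeless (or no mappings at all): nothing to remove.
    if not mappings or "properties" in mappings:
        return body
    if len(mappings) != 1:
        raise ValueError("expected exactly one mapping type per index")
    # Lift the contents of the single type up one level.
    (type_mapping,) = mappings.values()
    body["mappings"] = type_mapping
    return body


typed_body = {"mappings": {"_doc": {"properties": {"user": {"type": "keyword"}}}}}
typeless_body = strip_mapping_type(typed_body)
```

Here `typeless_body` keeps the same field definitions, just without the `_doc` level in between.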

answered Oct 13 '22 00:10 by Rash


As of Elasticsearch 8.x, only _doc is supported, and it is just an endpoint name, not a document type.

In 7.0, _doc represents the endpoint name instead of the document type. The _doc component is a permanent part of the path for the document index, get, and delete APIs going forward, and will not be removed in 8.0.

Elasticsearch 8.x: Specifying types in requests is no longer supported. The include_type_name parameter is removed.

Schedule For Removal of Mapping Types
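As a small illustration of _doc being a fixed path component for the document index, get, and delete APIs, the sketch below builds the HTTP method and path for each. The helper name `document_request` is an assumption, and it only formats strings; it does not talk to a cluster.

```python
def document_request(action, index, doc_id):
    """Return (HTTP method, path) for a typeless single-document request.

    Hypothetical helper covering only the index, get, and delete APIs.
    """
    methods = {"index": "PUT", "get": "GET", "delete": "DELETE"}
    if action not in methods:
        raise ValueError(f"unsupported action: {action}")
    # _doc is the endpoint name here, not a user-chosen mapping type.
    return methods[action], f"/{index}/_doc/{doc_id}"


method, path = document_request("get", "twitter", 1)
```

With the twitter index from the other answer, all three actions share the path /twitter/_doc/1 and differ only in the HTTP method.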

answered Oct 13 '22 01:10 by j n