
Is it possible to find which elasticsearch shard(s) a document is on?

I'm having trouble debugging a parent child relationship query. I'd like to know ways of debugging the issue, rather than simply posting my mapping, data, query, and asking what's wrong (but I reserve the right to do so eventually!).

To that end, a start would be checking if my children and associated parent are on the same shard. I don't trust my mapping, and I don't want to calculate which shard the documents are theoretically on, using shard = hash(routing) % number_of_primary_shards. I want a query that returns a definite answer.

asked Apr 15 '15 at 16:04 by 10 cls



2 Answers

You can set the explain flag in your query, and each hit will then report which shard and node it came from. A sample query:

{
  "explain": true,
  "query": {
    "match_all": {}
  }
}

Along with the document ID, index name, and type name, each hit will also include the node ID and shard ID.
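As a rough sketch of how that output could be used, the Python snippet below groups the hits of an explain-enabled search response by shard, so you can see at a glance whether a parent and its children share one. It assumes the response has already been parsed into a dict and that each hit carries `_shard` and `_node` keys. The exact shape can differ between Elasticsearch versions, and `group_hits_by_shard` and the sample data are made up for illustration.

```python
from collections import defaultdict


def group_hits_by_shard(response):
    """Map shard ID -> list of (type, id, node) tuples for every hit.

    Assumes each hit exposes `_shard` and `_node`, as reported when the
    search was run with "explain": true.
    """
    placement = defaultdict(list)
    for hit in response.get("hits", {}).get("hits", []):
        entry = (hit.get("_type"), hit.get("_id"), hit.get("_node"))
        placement[hit.get("_shard")].append(entry)
    return dict(placement)


# Hypothetical, heavily trimmed response for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {"_shard": 2, "_node": "node-a", "_type": "parent", "_id": "123"},
            {"_shard": 2, "_node": "node-a", "_type": "child", "_id": "c1"},
            {"_shard": 0, "_node": "node-b", "_type": "child", "_id": "c2"},
        ]
    }
}

placement = group_hits_by_shard(sample_response)
for shard, docs in sorted(placement.items()):
    print(shard, docs)
```

In this made-up data, child `c2` sits on a different shard from parent `123`, which is the kind of mismatch worth chasing.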

You can find samples of explain usage in the Elasticsearch documentation.

answered Sep 19 '22 at 11:09 by Vineeth Mohan


To debug this you don't have to check shard and node identifiers at all.

All you have to do is make sure that the _parent and/or _routing fields of the child documents match the id of the parent document. Use /_search?pretty&fields=_parent,_routing&_source=true to show these fields.
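The check described above could be scripted along these lines. This is only a sketch: it assumes the child hits have already been fetched and parsed into dicts, and that `_parent` and `_routing` show up either at the top level of a hit or under its `fields` key (possibly wrapped in a list), which varies by Elasticsearch version. The helper names and sample hits are hypothetical.

```python
def metadata_value(hit, name):
    """Return a metadata field such as "_parent" from a hit, or None."""
    value = hit.get(name)
    if value is None:
        value = hit.get("fields", {}).get(name)
    # Some versions wrap values requested via `fields` in a list.
    if isinstance(value, list):
        value = value[0] if value else None
    return value


def misrouted_children(parent_id, child_hits):
    """Return IDs of children whose _parent/_routing do not match parent_id.

    A child is flagged when either field is present with a different value,
    or when neither field is present at all.
    """
    flagged = []
    for hit in child_hits:
        present = [
            v
            for v in (metadata_value(hit, "_parent"), metadata_value(hit, "_routing"))
            if v is not None
        ]
        if not present or any(v != parent_id for v in present):
            flagged.append(hit.get("_id"))
    return flagged


# Hypothetical child hits for illustration only.
children = [
    {"_id": "c1", "fields": {"_parent": "123", "_routing": "123"}},
    {"_id": "c2", "fields": {"_parent": "123", "_routing": "999"}},
    {"_id": "c3", "_parent": "123"},
    {"_id": "c4"},
]

flagged = misrouted_children("123", children)
print(flagged)
```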

To find docs with a specific _routing or _parent id, use:

/_search?pretty&fields=_parent,_routing&_source=true&q=id:123 OR _routing:123 OR _parent:123

This will find both parent docs and child docs.
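The query string above contains spaces and colons, so it needs URL-encoding if you send it from a script rather than pasting it into a browser. A minimal Python sketch of building that path follows. `routing_search_path` is a made-up helper, and it passes `pretty=true` explicitly instead of the bare `pretty` flag.

```python
from urllib.parse import urlencode


def routing_search_path(doc_id):
    """Build the URL-encoded /_search path for one id, routing, or parent value."""
    query = "id:{0} OR _routing:{0} OR _parent:{0}".format(doc_id)
    params = {
        "pretty": "true",
        "fields": "_parent,_routing",
        "_source": "true",
        "q": query,
    }
    return "/_search?" + urlencode(params)


path = routing_search_path("123")
print(path)
```

Append the result to your cluster's base URL (and index name, if you want to limit the search) before sending the request.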

answered Sep 17 '22 at 11:09 by johno