
ElasticSearch: Using output of one query as input to another

I have a problem that requires fetching a document by ID from Elasticsearch and then using it to build a second query. This works, but it forces two round trips to the Elasticsearch cluster. Can I do this in a single request, feeding the output of one query into another, to avoid the extra round trip?

Please let me know if you don't understand the issue.

Global Warrior asked Nov 18 '14 11:11


1 Answer

I would like to use this opportunity to suggest a different approach to the problem. Elasticsearch: The Definitive Guide already explains it well, so I will simply quote it:

Four common techniques are used to manage relational data in Elasticsearch:

  • Application-side joins
  • Data denormalization
  • Nested objects
  • Parent/child relationships

Often the final solution will require a mixture of a few of these techniques.

In practice, data denormalization means storing the data so that a single query does what previously took two consecutive queries.

Here I will walk through the example from that book. Suppose you have the following two document types, user and blogpost, in one index, and you want to find all blog posts written by anyone named John:

PUT /my_index/user/1
{
  "name":     "John Smith",
  "email":    "[email protected]",
  "dob":      "1970/10/24"
}

PUT /my_index/blogpost/2
{
  "title":    "Relationships",
  "body":     "It's complicated...",
  "userID":     1
}
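
With the documents laid out like this, the lookup is an application-side join. A minimal Python sketch of the two request bodies follows; it only builds dictionaries and never contacts a cluster. The function names and the hand-written sample response are hypothetical, the field names come from the documents above, and the exact query DSL shape may differ between Elasticsearch versions.

```python
# Sketch of an application-side join: two query bodies, two round trips.
# Field names ("name", "userID") follow the example documents above;
# the function names are hypothetical.

def users_named_query(name):
    """Step 1: body for a search on the user type, matching by name."""
    return {"query": {"match": {"name": name}}}


def extract_ids(search_response):
    """Collect the _id of every hit in a standard search response.

    userID is numeric in the example documents, so the string _id
    values are converted to int (an assumption about this data set).
    """
    return [int(hit["_id"]) for hit in search_response["hits"]["hits"]]


def posts_by_users_query(user_ids):
    """Step 2: body for a search on the blogpost type, filtered by author IDs."""
    return {"query": {"terms": {"userID": user_ids}}}


# Hand-written stand-in for the step 1 response; no cluster is involved.
fake_response = {
    "hits": {"hits": [{"_id": "1", "_source": {"name": "John Smith"}}]}
}

step1_body = users_named_query("John")
step2_body = posts_by_users_query(extract_ids(fake_response))
```

The second body cannot be built until the first response has arrived, which is exactly the extra round trip the question is about.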

With this layout there is no other option but to first fetch the IDs of all users named John and then query the blog posts by those IDs. What you can do instead is copy some of the user information into the blogpost document:

PUT /my_index/user/1
{
  "name":     "John Smith",
  "email":    "[email protected]",
  "dob":      "1970/10/24"
}

PUT /my_index/blogpost/2
{
  "title":    "Relationships",
  "body":     "It's complicated...",
  "user":     {
    "id":       1,
    "name":     "John Smith" 
  }
}

This makes it possible to search blog posts directly on user.name, in a single query.
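
As a rough illustration, the single request body against the denormalized blogpost documents could be built like this in Python. The function name is hypothetical, and a plain match query on user.name is an assumption about how the field is mapped; the code only constructs and serializes the body.

```python
import json

# Sketch: one query body that replaces the two-step lookup once the
# author's name is stored on each blogpost document.
# The function name is hypothetical.

def posts_by_author_name_query(name):
    """Body for a search on blogpost, matching the embedded user.name field."""
    return {"query": {"match": {"user.name": name}}}


body = posts_by_author_name_query("John")

# Serialized form, suitable for sending as the body of a _search request.
body_json = json.dumps(body)
```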

Apart from the built-in Elasticsearch techniques, you may also consider third-party plugins such as Siren Join:

This join is used to filter one document set based on a second document set, hence its name. It is equivalent to the EXISTS() operator in SQL.

Nikolay Vasiliev answered Sep 29 '22 21:09