Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to 'join' two indexes in ElasticSearch

Tags:

I have two indexes that must be separated:

// index = `order_item` {     "ID": 1,     "Name": "Shoes",     "Price": 9.99,     "OrderID": 82 },{     "ID": 1,     "Name": "Hat",     "Price": 19.99,     "OrderID": 82 }  // index = `order` {     "ID": 82,     "Customer": "John Smith" } 

How would I 'join' these two tables on a search, such that it would return something along the lines of:

results = {     "ID": 1,     "Name": "Shoes",     "Price": 9.99,     "Order.ID": 82,     "Customer": "John Smith" },{     "ID": 1,     "Name": "Hat",     "Price": 19.99,     "Order.ID": 82,     "Customer": "John Smith" } 
like image 777
David542 Avatar asked Oct 25 '15 21:10

David542


People also ask

Can I join indexes in Elasticsearch?

Joining queriesedit Instead, Elasticsearch offers two forms of join which are designed to scale horizontally. Documents may contain fields of type nested . These fields are used to index arrays of objects, where each object can be queried (with the nested query) as an independent document.

Can Elasticsearch have multiple indexes?

Elasticsearch features a powerful scale-out architecture based on a feature called Sharding. As document volumes grow for a given index, users can add more shards without changing their applications for the most part. Another option available to users is the use of multiple indexes.


1 Answers

As answered in your other question, nothing prevents you from storing the Customer name inside each order_item document at indexing time, while still having a dedicated index orders also containing the Customer data. Remember that it's all about cleverly denormalizing your data so that each of your documents be as "self-contained" as you need.

curl -XPUT localhost:9200/order_items/order_item/1 -d '{     "ID": 1,     "Name": "Shoes",     "Price": 9.99,     "OrderID": 82,     "Customer": "John Smith" }'  curl -XPUT localhost:9200/order_items/order_item/2 -d '{     "ID": 2,     "Name": "Hat",     "Price": 19.99,     "OrderID": 82,     "Customer": "John Smith" } 

The advantages of this solution is that each order item is completely self-contained, and you can group/aggregate them on OrderID in order to get all items of a given order.

Also, as @JohnAment mentioned in his comment, the order/order_item use case is also a good candidate for using either

  1. the parent/child relationship
  2. or nested objects.

In the first case, you'd have one order "parent" document...

curl -XPUT localhost:9200/orders/order/82 -d '{     "ID": 82,     "Customer": "John Smith" }' 

And several order_item "children" documents that you index using their parent ID:

curl -XPUT localhost:9200/order_items/order_item/1?parent=82 -d '{      "ID": 1,      "Name": "Shoes",      "Price": 9.99 }' curl -XPUT localhost:9200/order_items/order_item/2?parent=82 -d '{      "ID": 2,      "Name": "Hat",      "Price": 19.99 }' 

In the second case, your order document would contain all order items in a nested OrderItems property and would look like this:

curl -XPUT localhost:9200/orders/order/82 -d '{     "ID": 82,     "Customer": "John Smith",     "OrderItems": [       {         "ID": 1,         "Name": "Shoes",         "Price": 9.99       },{         "ID": 2,         "Name": "Hat",         "Price": 19.99       }     ] }' 
like image 163
Val Avatar answered Oct 05 '22 03:10

Val