I have an SQL query that does JOIN operations over 3-4 tables to get the data. Now we are shifting to elasticsearch for better performance. How can I replicate the same JOIN query using elasticsearch? I have read about parent/child documents but my data doesn't have any strict parent/child kind of data.
Elasticsearch does not support JOINs, that's the whole purpose of NoSQL technologies in the first place. There are ways to create some relationships between your data using parent/child relationships (as you've noticed), nested objects and also using a special terms lookup optimization, but that's about it.
However, in order to take the most out of Elasticsearch, the main idea is to denormalize your data as much as possible and store self-contained documents. This means that you can take a single document and it contains all the info it needs. You don't care about the fact that data is going to be duplicated.
For instance, suppose you have the following JOIN query in your SQL database for retrieving all people with their address, city and country (i.e. 4 tables):
SELECT p.first_name, p.last_name,
a.street_name, a.street_num,
c.name, c2.name
FROM person p
JOIN address a ON a.id = p.addr_id
JOIN city c ON c.id = p.city_id
JOIN country c2 ON c2.id = p.country_id
In Elasticsearch, you would create a document containing exactly the fields that are returned by the above query, i.e.
{
"first_name": "John",
"last_name": "Doe",
"street_num": 34,
"street_name": "Main Street",
"city": "San Francisco",
"country": "United States"
}
So, one way to see this, is that in Elasticsearch you're going to store the same (or very similar) set of fields that you would get as a result of running your query in your relational database.
Taking the step from RDBMS to Elasticsearch is a paradigm shift. If you are ever going to take that step, you need to think different in terms of your data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With