Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Join operation with NOSQL

I have gone through some articles regarding Bigtable and NOSQL. It is very interesting that they avoid JOIN operations.

As a basic example, let's take Employee and Department table and assume the data is spread across multiple tables / servers.

Just want to know, if data is spread across multiple servers, how do we do JOIN or UNION operations?

like image 231
Sri Avatar asked Jan 03 '10 15:01

Sri


People also ask

Does NoSQL require joins?

In case of NoSQL, data for an item (or data point) is mostly stored in one collection. However, there might be cases where we need to span multiple collections to obtain all the data we need. Thus, join queries are of crucial importance for both type of databases.

Can MongoDB do joins?

Fortunately, MongoDB Joins can be performed in MongoDB 3.2 as it introduces a new Lookup operation that can perform Join operations on Collections.

Why joins are not used in MongoDB?

One order and 8 line items are 9 rows in relational; in MongoDB we would typically just model this as a single BSON document which is an order with an array of embedded line items. So in that case, the join issue does not arise.

Can you combine SQL and NoSQL?

Meeting the unique needs of OLTP applications NewSQL databases provide a solution for applications that require highly scalable, online transaction processing (OLTP) platforms by combining the ACID guarantees of SQL-based relational database engines with the horizontal scalability of NoSQL systems.


1 Answers

When you have extremely large data, you probably want to avoid joins. This is because the overhead of an individual key lookup is relatively large (the service needs to figure out which node(s) to query, and query them in parallel and wait for responses). By overhead, I mean latency, not throughput limitation.

This makes joins suck really badly as you'd need to do a lot of foreign key lookups, which would end up going to many,many different nodes (in many cases). So you'd want to avoid this as a pattern.

If it doesn't happen very often, you could probably take the hit, but if you're going to want to do a lot of them, it may be worth "denormalising" the data.

The kind of stuff which gets stored in NoSQL stores is typically pretty "abnormal" in the first place. It is not uncommon to duplicate the same data in all sorts of different places to make lookups easier.

Additionally most nosql don't (really) support secondary indexes either, which means you have to duplicate stuff if you want to query by any other criterion.

If you're storing data such as employees and departments, you're really better off with a conventional database.

like image 112
MarkR Avatar answered Sep 23 '22 20:09

MarkR