
When to use DynamoDB: use cases

I've been trying to figure out which use cases are the best fit for Amazon DynamoDB.

When I googled, most blogs said DynamoDB should be used only for large amounts of data (big data).

I come from a relational database background, and NoSQL databases are new to me, so I've been trying to relate this to what I know about relational databases.

Most DynamoDB concepts revolve around creating a schema-less table with partition keys and sort keys, and then querying it by those keys. There is also no concept of stored procedures, which make queries easier and simpler.

If we are managing such huge amounts of data, is running such complex queries every time we retrieve data the right approach without stored procedures?

Note: I may have misunderstood the concept, so could anyone please clear this up for me?

Thanks in advance
Jay

Jayendran asked Nov 30 '17 16:11



1 Answer

In short, systems like DynamoDB are designed to support big data sets (too big to fit on a single server) and high read/write throughput by scaling horizontally, as opposed to scaling vertically, which has historically been the more common approach for relational databases.

The main approach to supporting horizontal scalability is partitioning the data, i.e. a data set is split into multiple pieces and distributed among multiple servers. This way the system as a whole can use more storage and more IOPS, allowing bigger data sets and higher read/write throughput.
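To make the partitioning idea concrete, here is a minimal sketch in Python. It is a toy model, not how DynamoDB is implemented internally: the `PartitionedStore` class, the choice of MD5 as the hash, and the in-memory dictionaries standing in for servers are all assumptions made for illustration.

```python
import hashlib


class PartitionedStore:
    """Toy key-value store that spreads records over several in-memory 'servers'."""

    def __init__(self, num_partitions: int) -> None:
        # Each dict stands in for one physical server (an assumption for the demo).
        self.partitions = [{} for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Hash the key and map the hash onto one of the partitions.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.partitions)

    def put(self, key: str, record: dict) -> None:
        self.partitions[self.partition_for(key)][key] = record

    def get(self, key: str):
        # Only the single partition that owns the key is consulted.
        return self.partitions[self.partition_for(key)].get(key)


store = PartitionedStore(num_partitions=4)
for i in range(1000):
    store.put(f"user#{i}", {"id": i})

# How many records ended up on each "server".
sizes = [len(p) for p in store.partitions]
```

Each "server" holds only a fraction of the records, so adding servers adds both storage and capacity for reads and writes.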

However, data partitioning makes it difficult to support complex queries such as joins, because the data is distributed among multiple physical servers. Stored procedures are not supported for the same reason. Historically, the idea behind stored procedures is data locality: they run on the server next to the data, without network operations. If the data is distributed among multiple servers, this benefit disappears (at least in the form of a stored procedure).
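A rough illustration of why this hurts: in the sketch below, the orders are partitioned by order id, so a question about a customer (a non-key attribute) cannot be routed to one server and has to visit all of them. The data, the `partitions` list, and the `scatter_gather` helper are hypothetical and exist only for this example.

```python
from typing import Callable

# Three "servers", each holding a slice of the orders, partitioned by order id.
partitions = [
    {"o1": {"customer": "alice", "total": 30}},
    {"o2": {"customer": "bob", "total": 15},
     "o3": {"customer": "alice", "total": 20}},
    {"o4": {"customer": "carol", "total": 50}},
]


def scatter_gather(predicate: Callable[[dict], bool]):
    """Run a filter on every partition and merge the results."""
    matches = []
    servers_contacted = 0
    for partition in partitions:
        servers_contacted += 1  # in a real system this would be a network call
        matches.extend(r for r in partition.values() if predicate(r))
    return matches, servers_contacted


# "All orders for alice" is not a key lookup, so every server is asked.
alice_orders, contacted = scatter_gather(lambda r: r["customer"] == "alice")
alice_total = sum(r["total"] for r in alice_orders)  # 30 + 20 = 50
```

With three partitions this is cheap, but the cost grows with the number of servers, and no single server sees all the rows it would need to run the aggregation locally the way a stored procedure would.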

Therefore the most efficient way to query data from such systems is by record key, as data partitioning is based on a key and it's easy to figure out where a record lives physically for a given key. While many such systems also support secondary indexes, they are usually restricted in some way or expensive, and may not be enough to satisfy the requirements of a complex software solution. A fairly common approach is to have a complementary indexing/query solution (I've seen solutions based on Elasticsearch and Solr), which allows running complex queries over some fragments of records to figure out a record key, which is then used to load the record.
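The complementary-index pattern can be sketched as follows. Everything here is a stand-in: `TinySearchIndex` is a toy inverted index playing the role of Elasticsearch or Solr, the `records` dictionary plays the role of the key-value store, and `find_records` is a hypothetical helper. The point is only the two-step flow: search returns keys, keys load records.

```python
from collections import defaultdict

# Stand-in for the key-value store: full records, addressable only by key.
records = {
    "doc#1": {"title": "DynamoDB partition keys explained"},
    "doc#2": {"title": "Relational joins explained"},
    "doc#3": {"title": "DynamoDB secondary indexes"},
}


class TinySearchIndex:
    """Toy inverted index: maps each word to the keys of records containing it."""

    def __init__(self) -> None:
        self._postings = defaultdict(set)

    def index(self, key: str, text: str) -> None:
        for word in text.lower().split():
            self._postings[word].add(key)

    def search(self, query: str) -> set:
        # Return keys of records that contain every word of the query.
        words = query.lower().split()
        if not words:
            return set()
        result = set(self._postings.get(words[0], set()))
        for word in words[1:]:
            result &= self._postings.get(word, set())
        return result


# Index only a fragment of each record (here, the title).
index = TinySearchIndex()
for key, record in records.items():
    index.index(key, record["title"])


def find_records(query: str) -> list:
    # Step 1: the search side resolves the query to record keys.
    keys = sorted(index.search(query))
    # Step 2: the key-value side loads each full record by key.
    return [records[k] for k in keys]
```

For example, `find_records("dynamodb explained")` resolves to the single key `doc#1` and then loads that record by key, which is the access path the partitioned store handles efficiently.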

Egor answered Jan 04 '23 16:01
