I've read some articles saying that an RDBMS such as MySQL does not scale well, while a NoSQL database such as MongoDB shards well. I want to know which features of an RDBMS prevent it from sharding well.
NoSQL databases are comparatively easy to deploy and typically run on cheap commodity servers to handle rapidly growing data and transaction volumes, while RDBMS deployments tend to rely on expensive, large servers and storage systems. As a result, the cost per gigabyte of storing and processing data can be many times lower with NoSQL than with an RDBMS.
NoSQL databases are designed to scale horizontally, which means they can handle increased traffic by adding more servers to the cluster. This lets them grow larger and more powerful over time, making them a common choice for large or constantly evolving data sets.
The main reason relational databases are hard to scale horizontally is the flexibility of the query language. SQL lets you put all sorts of conditions and filters on your data, so the database system often cannot know which pieces of your data will be fetched, and therefore which servers hold them, until the query is executed.
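To make the point about query flexibility concrete, here is a minimal sketch, assuming a toy in-memory table sharded by a hash of `user_id`. The names `NUM_SHARDS`, `shard_for`, `get_by_key` and `find_where` are made up for illustration and do not come from any real database. A lookup on the shard key touches one shard, while an arbitrary filter has to visit all of them.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(user_id: int) -> int:
    """Map a shard key to a shard index with a stable hash."""
    digest = hashlib.sha256(str(user_id).encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def build_shards(rows):
    """Distribute rows over in-memory 'shards' keyed by user_id."""
    shards = [dict() for _ in range(NUM_SHARDS)]
    for row in rows:
        shards[shard_for(row["user_id"])][row["user_id"]] = row
    return shards


def get_by_key(shards, user_id):
    """Lookup by shard key: the router knows which single shard to ask."""
    return shards[shard_for(user_id)].get(user_id), 1


def find_where(shards, predicate):
    """Arbitrary filter: no way to know where matches live, so ask every shard."""
    matches, visited = [], 0
    for shard in shards:
        visited += 1
        matches.extend(row for row in shard.values() if predicate(row))
    return matches, visited


rows = [{"user_id": i, "age": 20 + i % 5} for i in range(20)]
shards = build_shards(rows)
row, shards_hit = get_by_key(shards, 7)
matches, shards_scanned = find_where(shards, lambda r: r["age"] == 22)
print(shards_hit, shards_scanned)
```

A key-value style API only allows the first kind of access, which is part of why it shards easily; SQL allows both, so the engine has to be prepared for the second.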
Not quite the right choice: most NoSQL databases do not provide ACID transactions. Modern applications that require these properties for their transactions cannot make good use of NoSQL. NoSQL databases also do not use Structured Query Language (SQL) and are not preferred for structured data.
Most RDBMS systems guarantee the so-called ACID properties (atomicity, consistency, isolation, durability). Most of these properties boil down to consistency: every modification of your data takes the database from one consistent state to another consistent state.
For example, if you update multiple records in a single transaction, the database ensures that the records involved are not modified by other queries until the transaction has completed. So during the transaction, multiple tables may be locked for modification. If those tables are spread across multiple shards/servers, it takes more time to acquire the appropriate locks, update the data and release the locks.
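The locking cost described above can be sketched with a simplified two-phase commit over in-memory shards. This is an illustration under stated assumptions, not how any particular RDBMS implements distributed transactions: the `Shard` class and `run_transaction` function are hypothetical, and each call to a shard stands in for one network round trip.

```python
class Shard:
    """Toy shard with per-row locks and staged (uncommitted) writes."""

    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.locked = set()
        self.staged = {}

    def prepare(self, key, new_value):
        """Phase 1: lock the row and stage the write; refuse if already locked."""
        if key in self.locked:
            return False
        self.locked.add(key)
        self.staged[key] = new_value
        return True

    def commit(self, key):
        """Phase 2: apply the staged write and release the lock."""
        self.data[key] = self.staged.pop(key)
        self.locked.discard(key)

    def abort(self, key):
        """Discard the staged write and release the lock."""
        self.staged.pop(key, None)
        self.locked.discard(key)


def run_transaction(writes):
    """writes: list of (shard, key, new_value). Returns (committed, round_trips)."""
    prepared, round_trips = [], 0
    for shard, key, value in writes:
        round_trips += 1
        if shard.prepare(key, value):
            prepared.append((shard, key))
        else:
            # Roll back only the rows this transaction managed to lock.
            for s, k in prepared:
                round_trips += 1
                s.abort(k)
            return False, round_trips
    for s, k in prepared:
        round_trips += 1
        s.commit(k)
    return True, round_trips


a = Shard("a", {"alice": 100})
b = Shard("b", {"bob": 50})
ok, trips = run_transaction([(a, "alice", 70), (b, "bob", 80)])
print(ok, trips, a.data, b.data)
```

On a single server the same update needs no coordination messages at all. Here every participating shard adds a prepare and a commit round trip, and rows stay locked for the whole exchange.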
The CAP theorem states that a distributed (i.e. scalable) system cannot guarantee all three of the following properties at the same time: consistency, availability, and partition tolerance.
RDBMS systems guarantee consistency. Sharding means the system must tolerate partitioning. It follows from the theorem that the system therefore cannot guarantee availability. That's why a standard RDBMS cannot scale very well: it won't be able to guarantee availability. And what good is a database if you can't access it?
NoSQL databases drop consistency in favor of availability. That's why they are better at scalability.
I'm not saying RDBMS systems cannot scale at all; it's just harder. This article outlines some of the possible sharding schemes and the problems you may encounter. Most approaches sacrifice consistency, which is one of the most important features of RDBMS systems and also the guarantee that makes them hard to scale.
Why NoSQL dudes and dudettes don't like joins: http://www.dbms2.com/2010/05/01/ryw-read-your-writes-consistency/
Queries involving multiple shards are complex (e.g. JOINs between tables in different shards).
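As a rough illustration of why cross-shard JOINs are awkward, the sketch below performs the join in the application instead of in the database. It assumes two toy tables, `users` and `orders`, each split over its own list of in-memory shards; `cross_shard_join` is a hypothetical helper, not an API of any real system.

```python
def cross_shard_join(user_shards, order_shards):
    """Application-side hash join: build a lookup on users, probe it with orders."""
    users_by_id = {}
    for shard in user_shards:       # one fetch per user shard
        for user in shard:
            users_by_id[user["id"]] = user

    joined = []
    for shard in order_shards:      # one fetch per order shard
        for order in shard:
            user = users_by_id.get(order["user_id"])
            if user is not None:    # inner join: drop orders without a user
                joined.append((user["name"], order["item"]))
    return joined


user_shards = [
    [{"id": 1, "name": "alice"}],
    [{"id": 2, "name": "bob"}],
]
order_shards = [
    [{"user_id": 2, "item": "book"}, {"user_id": 3, "item": "lamp"}],
    [{"user_id": 1, "item": "pen"}],
]
result = cross_shard_join(user_shards, order_shards)
print(result)
```

Every shard of both tables has to be read and the rows shipped to one place before they can be matched, which is the work a single-server RDBMS does locally.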