I am looking at CouchDB, which has a number of appealing features over relational databases including:
- intuitive REST/HTTP interface
- easy replication
- data stored as documents, rather than normalised tables
I appreciate that this is not a mature product so should be adopted with caution, but am wondering whether it is actually a viable replacement for an RDBMS (in spite of the intro page saying otherwise - http://couchdb.apache.org/docs/intro.html).
- Under what circumstances would CouchDB be a better choice of database than an RDBMS (e.g. MySQL), for example in terms of scalability, design and development time, reliability and maintenance?
- Are there cases where an RDBMS is still clearly the right choice?
- Is this an either-or choice, or is a hybrid solution more likely to emerge as best practice?
Why should you use CouchDB?
The architectural design of CouchDB makes it extremely adaptable when partitioning databases and scaling data onto multiple nodes. CouchDB supports both horizontal partitioning and replication to create an easily managed solution for balancing both read and write loads during a database deployment.
When would you use a non-relational database?
Non-relational databases are often used when large quantities of complex and diverse data need to be organized. For example, a large store might have a database in which each customer has their own document containing all of their information, from name and address to order history and credit card information.
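To make the "one document per customer" idea concrete, here is a minimal sketch in Python. The field names and values are made up for illustration (only `_id` follows CouchDB's convention for the document identifier), and the order total is computed from the single document without touching any other record.

```python
import json

# Hypothetical customer document; every field name and value is illustrative.
customer_doc = {
    "_id": "customer:1001",
    "name": "Jane Example",
    "address": {"street": "1 High Street", "city": "London"},
    "orders": [
        {"order_id": "A-1", "total": 25.50, "items": ["kettle"]},
        {"order_id": "A-2", "total": 12.00, "items": ["mug", "teaspoon"]},
    ],
}


def order_total(doc):
    """Sum the order totals stored inside a single customer document."""
    return sum(order["total"] for order in doc["orders"])


# The whole record round-trips as one JSON value, which is the unit a
# document store reads and writes.
restored = json.loads(json.dumps(customer_doc))
print(order_total(restored))
```

In a relational schema the same data would typically be spread over customer, address, order and order-item tables and reassembled with joins.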
When would you use an RDBMS?
In general, one should consider an RDBMS if one has multi-row transactions and complex joins. In a NoSQL database like MongoDB, for example, a document (aka complex object) can be the equivalent of rows joined across multiple tables, and consistency is guaranteed within that object.
I recently attended the NoSQL conference in London and think I have a better idea now how to answer the original question. I also wrote a blog post, and there are a couple of other good ones.
Key points:
- We have accumulated probably 30 years of knowledge of administering relational databases, so we shouldn't replace them without careful consideration; non-relational data stores are less mature than relational ones, and so are inherently riskier to adopt
- There are different types of non-relational data store; some are key-value stores, some are document stores, some are graph databases
- You could use a hybrid approach, e.g. a combination of RDBMS and graph data store for a social software site
- Document data stores (e.g. CouchDB and MongoDB) are probably the closest to relational databases. They store each record as a JSON structure with all of its fields nested hierarchically, which avoids table joins and (some might argue) is an improvement on the object-relational mapping that most applications currently use
- Many non-relational databases support replication (including master-master); relational databases support replication too, but it may not be as comprehensive as the non-relational option
- Very large sites such as Twitter, Digg and Facebook use Cassandra, which is built from the ground up to support clustering
- Relational databases are probably suitable for 90% of cases
In summary, consensus seems to be "proceed with caution".
Until someone gives a more in-depth answer, here are some pros and cons for CouchDB:
Pros:
- you don't need to fit your data into one of those pesky higher-order normal forms
- you can change the "schema" of your data at any time
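- your data is indexed specifically for your queries, so lookups stay fast as the data grows (views are stored as B-trees, so a lookup is logarithmic in the number of rows rather than strictly constant time).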
Cons:
- you need to create a view for each and every query, i.e. ad-hoc queries (such as building dynamic WHERE and ORDER BY clauses in SQL) are not available.
- you will either have redundant data, or you will end up implementing join and sort logic yourself on "client-side" (e.g. sorting a many-to-many relationship on multiple fields)
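As a rough illustration of the "client-side" join mentioned above, the sketch below resolves a many-to-many relationship in application code and sorts the result on two fields. The documents, field names and the `join_enrolments` helper are all hypothetical; in practice the three lists would come back from separate queries.

```python
# Hypothetical documents, as they might come back from three separate queries.
students = [
    {"_id": "s1", "name": "Ann"},
    {"_id": "s2", "name": "Bob"},
]
courses = [
    {"_id": "c1", "title": "Maths"},
    {"_id": "c2", "title": "Art"},
]
enrolments = [
    {"student": "s2", "course": "c1"},
    {"student": "s1", "course": "c1"},
    {"student": "s1", "course": "c2"},
]


def join_enrolments(students, courses, enrolments):
    """Resolve the many-to-many links in application code.

    Returns (course title, student name) pairs sorted by title, then name.
    """
    student_by_id = {s["_id"]: s for s in students}
    course_by_id = {c["_id"]: c for c in courses}
    rows = [
        (course_by_id[e["course"]]["title"], student_by_id[e["student"]]["name"])
        for e in enrolments
    ]
    # Tuples sort field by field, which gives the multi-field ordering.
    return sorted(rows)


for title, name in join_enrolments(students, courses, enrolments):
    print(title, name)
```

In SQL this would be a single query with two joins and an ORDER BY; here the lookup tables and the sort are the application's responsibility.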
Pros or Cons:
- creating views is not as straightforward as writing SQL, it's more like solving a puzzle. Whether that is a pro or a con depends on your type :)
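To give a feel for what "creating a view" involves, here is a small Python simulation of the idea, not CouchDB's actual API. Real CouchDB views are normally JavaScript map functions stored in design documents; the function names, document fields and in-memory index below are assumptions made purely for illustration. A map function emits key/value rows, the rows are kept sorted by key, and a query is a binary search on that key.

```python
import bisect


def map_orders_by_city(doc):
    """Hypothetical map function: emit one (key, value) row per order document."""
    if doc.get("type") == "order":
        yield (doc["city"], doc["total"])


def build_view(docs, map_fn):
    """Run the map function over every document and keep the rows sorted by key."""
    rows = sorted(
        (row for doc in docs for row in map_fn(doc)),
        key=lambda row: row[0],
    )
    return {"keys": [k for k, _ in rows], "values": [v for _, v in rows]}


def query_view(view, key):
    """Return the values emitted under `key`, located by binary search."""
    lo = bisect.bisect_left(view["keys"], key)
    hi = bisect.bisect_right(view["keys"], key)
    return view["values"][lo:hi]


docs = [
    {"type": "order", "city": "London", "total": 10},
    {"type": "order", "city": "Leeds", "total": 7},
    {"type": "customer", "name": "Ann"},
    {"type": "order", "city": "London", "total": 5},
]

orders_by_city = build_view(docs, map_orders_by_city)
print(query_view(orders_by_city, "London"))
```

The view answers "orders by city" quickly, but a different question (say, orders above a certain total) would need its own map function and its own index, which is the trade-off described in the pros and cons above.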