Can RethinkDB handle large data sets (i.e. multiple terabytes) effectively enough to serve as the database for an analytics application?
Disclaimer: I'm one of the founders of RethinkDB. Sorry for a longish answer -- the question is surprisingly nuanced.
RethinkDB is designed with a very flexible architecture that can scale from small single instances to large clusters storing large amounts of data (definitely TB+), and can efficiently run a wide variety of queries (OLTP, OLAP, etc.).
However, in practice we're currently focused on the real-time aspects of the system -- most of our ongoing optimizations target the needs of real-time applications built on top of RethinkDB, which are typically OLTP-ish workloads. We will absolutely get to optimizing OLAP-style workloads, but that isn't a top priority right now.
The best way to find out whether Rethink will work for you is to take it for a spin and do some load-testing with data volumes and query patterns that resemble your actual workload. You should be able to find out pretty quickly how well things work. (If you do, and happen to run into issues, please let us know -- we'll be happy to help you out and fix any problems.)
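As a starting point for that kind of load test, here's a minimal sketch using the official Python driver. The table name, database, host, and port are all assumptions (standard RethinkDB defaults); adjust the document size and batch size to match your real workload before drawing any conclusions about throughput.

```python
# Hypothetical bulk-insert load test for RethinkDB.
# Assumes a server on localhost:28015 and the `rethinkdb` driver
# (pip install rethinkdb); table/db names are placeholders.
import random
import string
import time


def make_docs(n, start=0, doc_bytes=1024):
    """Generate n synthetic documents of roughly doc_bytes each,
    with primary keys start..start+n-1."""
    payload = "".join(random.choices(string.ascii_letters, k=doc_bytes))
    return [{"id": start + i, "payload": payload} for i in range(n)]


def run_load_test(total=100_000, batch=1_000):
    """Insert `total` docs in batches and report rough throughput.
    Requires a live RethinkDB server."""
    from rethinkdb import RethinkDB

    r = RethinkDB()
    conn = r.connect("localhost", 28015)  # assumed defaults
    r.db("test").table_create("load_test").run(conn)

    start = time.monotonic()
    for offset in range(0, total, batch):
        docs = make_docs(batch, start=offset)
        r.db("test").table("load_test").insert(docs).run(conn)
    elapsed = time.monotonic() - start
    print(f"{total} docs in {elapsed:.1f}s ({total / elapsed:.0f} docs/s)")


# run_load_test()  # uncomment once a server is running
```

From there you can vary document size, batch size, and concurrency, and also time a few representative read/aggregation queries against the populated table, since analytics workloads are usually read-heavy.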