We are using Cassandra as our OLTP database for storing transactions, and we are now evaluating requirements for a reporting solution.
One option we are considering is to use Cassandra as the reporting database as well, with a flattened schema.
What are the advantages and pitfalls of using Cassandra as a reporting DB?
The OLAP system needs data from Cassandra on a periodic basis. There are two requirements for this scenario: the frequency of the data copy must be reduced drastically, and the copied data must be consistent.
Cassandra is a poor fit when you need many-to-many mappings or join tables. It doesn't support a relational schema with foreign keys, and it has no server-side joins. So if your reports depend on a lot of complex join queries, Cassandra might not be the right database for you.
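To illustrate what a flattened schema means in practice, here is a minimal sketch in plain Python (no Cassandra driver involved). It resolves the "join" once at write time and groups the resulting wide rows by the key the report will query on, which is roughly how a query-specific Cassandra table would be laid out. All names here (`customers`, `transactions`, `region`, and so on) are made up for the example and are not from the original question.

```python
from collections import defaultdict

# Hypothetical source data; names and values are illustrative only.
customers = {
    1: {"name": "Alice", "region": "EU"},
    2: {"name": "Bob", "region": "US"},
}
transactions = [
    {"tx_id": 100, "customer_id": 1, "amount": 25.0},
    {"tx_id": 101, "customer_id": 2, "amount": 40.0},
    {"tx_id": 102, "customer_id": 1, "amount": 15.0},
]


def flatten(transactions, customers):
    """Resolve the customer lookup once, producing one wide row per transaction."""
    rows = []
    for tx in transactions:
        customer = customers[tx["customer_id"]]
        rows.append({
            "region": customer["region"],        # would act as the partition key
            "tx_id": tx["tx_id"],                # would act as the clustering key
            "customer_name": customer["name"],   # duplicated on purpose
            "amount": tx["amount"],
        })
    return rows


def partition_by_region(rows):
    """Group rows by the report's lookup key, mimicking a partitioned table."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row["region"]].append(row)
    return dict(partitions)


report_table = partition_by_region(flatten(transactions, customers))

# A report query now reads a single partition and needs no join.
eu_total = sum(row["amount"] for row in report_table["EU"])
```

The trade-off is that customer data is duplicated into every row, so a change to a customer has to be propagated to the flattened rows by the application or the copy job.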
Cassandra can be used both as a data warehouse (raw data storage) and as a database (final data storage). Which role fits depends on what you want to do with the data. You may even need both Hadoop and Cassandra, each for a different purpose.
So, what is Apache Cassandra? A distributed OLTP database built for high availability and linear scalability.
For OLAP workloads, consider using Spark in conjunction with Cassandra.
Here is a related post on Stack Overflow:
Is Cassandra for OLAP or OLTP or both?
Here is a presentation on a similar use case: https://www.slideshare.net/EvanChan2/breakthrough-olap-performance-with-cassandra-and-spark