
Cassandra as distributed cached data store

Tags:

cassandra

Can we use Cassandra as a distributed in-memory cache database by relying on its file-level caching, key cache, and row cache?

I don't want to overload any single node. As the data grows, I plan to add nodes to the cluster so that most of my data stays cached. This seems feasible because about 40% of my column families are static, and the other tables receive few updates or insertions.

Our primary goal is an elastic, real-time data store with speed close to that of an in-memory database.

Jobs asked Oct 30 '14 18:10



2 Answers

Cassandra was not designed for this purpose, but after many optimizations it has also become usable as an in-memory cache. There have been a few experiments; the most significant one I know of was reported by Netflix. They replaced their EVCache system (which was backed by Cassandra for persistence) with a new SSD-based Cassandra cache architecture, and the results were very impressive in terms of performance improvement and cost reduction.

Before choosing Cassandra as a replacement for any cache system, I'd recommend getting a thorough understanding of how row caching and key caching work. Also, I've never used DataStax Enterprise, but it has an interesting in-memory table feature.
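To make the row-cache and key-cache settings a bit more concrete, here is a minimal sketch in Python that only builds the CQL statement for a table's caching option. It assumes the map-style `caching` syntax used by Cassandra 2.1 and later (earlier releases used a plain string value). The function name `caching_cql` and the keyspace and table names are hypothetical, chosen purely for illustration.

```python
def caching_cql(keyspace, table, keys="ALL", rows_per_partition="NONE"):
    """Build an ALTER TABLE statement that sets per-table caching.

    Assumes the Cassandra 2.1+ map syntax. `keys` controls the key cache,
    `rows_per_partition` controls the row cache ('ALL', 'NONE', or a row count).
    """
    if keys not in ("ALL", "NONE"):
        raise ValueError("keys must be 'ALL' or 'NONE'")
    rows = str(rows_per_partition)
    if rows not in ("ALL", "NONE") and not rows.isdigit():
        raise ValueError("rows_per_partition must be 'ALL', 'NONE', or a row count")
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH caching = {{'keys': '{keys}', 'rows_per_partition': '{rows}'}};"
    )


# Hypothetical static lookup table: cache keys and whole partitions.
static_stmt = caching_cql("shop", "countries", rows_per_partition="ALL")

# Hypothetical table with some writes: key cache only.
busy_stmt = caching_cql("shop", "orders")
```

The resulting strings can be run through cqlsh or a driver session. Note that the row cache only takes effect if the node-level row cache size in `cassandra.yaml` (`row_cache_size_in_mb` in the 2.x to 4.0 releases) is set above its default of 0, so static tables like the ones described in the question are the likeliest candidates for `rows_per_partition`.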

HTH, Carlo

Carlo Bertuccini answered Sep 29 '22 11:09


I guess you could, but I don't think that's the right use case for Cassandra. Without knowing more about your requirements, I'd recommend looking at products such as Hazelcast, an in-memory distributed cache that sounds like a better fit for your use case.

Fredrik LS answered Sep 29 '22 11:09