
HBase vs Cassandra: Which is better for time-series data storage?

I use my API logs to extract information such as:

  • How many users did my API have in a given period of time?
  • Which types of services were called the most in a given period of time?
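
To make the two questions above concrete, here is a minimal Python sketch of both as time-window queries over in-memory log records. The record layout `(timestamp, user_id, service)`, the sample data, and the function names are assumptions for illustration; they are not taken from the actual logs, and a real deployment would push the time-range filter down to the database.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log records: (timestamp, user_id, service).
# The field layout and values are assumptions, not the real log schema.
LOGS = [
    (datetime(2014, 11, 21, 9, 0), "alice", "search"),
    (datetime(2014, 11, 21, 9, 5), "bob", "search"),
    (datetime(2014, 11, 21, 9, 7), "alice", "upload"),
    (datetime(2014, 11, 22, 10, 0), "carol", "upload"),
]


def in_window(logs, start, end):
    """Return the records whose timestamp satisfies start <= timestamp < end."""
    return [record for record in logs if start <= record[0] < end]


def count_users(logs, start, end):
    """Count the distinct users who called the API inside the window."""
    return len({user for _, user, _ in in_window(logs, start, end)})


def top_services(logs, start, end, n=1):
    """Return the n most-called services inside the window with their call counts."""
    counts = Counter(service for _, _, service in in_window(logs, start, end))
    return counts.most_common(n)
```

Both functions begin with the same time-range filter, which is why the timestamp ends up as the key access path in whichever store is chosen.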

Almost all the information I extract depends on the timestamp. I currently use MongoDB with an index on the timestamp (for 80 GB of data, the indexes take 12 GB).

I was advised to migrate to Cassandra or HBase, and I want to know which is better for my use case:

  • Analysis of time-series data.
  • Good performance is required for both writes and reads.
  • Ability to use Hadoop for my data analysis.

Thanks for sharing your point of view or your experience.

Mouna asked Nov 21 '14 17:11



1 Answer

Advantages of Cassandra: Cassandra generally shows better performance (though both are excellent). Cassandra is also substantially easier to set up and manage from an operational standpoint (though there are tools that will help either way).

Advantages of HBase: It is native to the Hadoop ecosystem.

HBase requires you to install Hadoop anyway, so you get a nice two-for-one. To run analytics on Cassandra, you will probably need either to use DataStax Enterprise, a commercial, non-open-source product, or to investigate Spark, which has an open-source connector for Cassandra.

mildewey answered Oct 13 '22 23:10
