Aggregation queries in Cassandra CQL

We are currently evaluating Cassandra as the data store for an analytical application. The plan was to dump raw data into Cassandra and then run mainly aggregation queries over it. Looking at CQL, it does not seem to support some traditional SQL features, such as:

Typical aggregate functions such as average, sum, and count distinct
GROUP BY and HAVING clauses

I did not find anything in the documentation that helps achieve the above. I also checked whether there are any hooks for providing such functions as extensions, like in-database map-reduce in MongoDB or user-defined functions in relational databases.

People do talk about the paid DataStax Enterprise edition, but even that achieves this not through plain Cassandra but through separate components such as Hadoop, Hive, and Pig. Other suggestions involve doing the needed pre-aggregations before writing data to the database, since Cassandra writes are fast.

That looks like too much overhead, at least for the basic things we need. Am I missing something fundamental here?

Would highly appreciate help on this.

asked May 08 '14 03:05

samantp

2 Answers

Aggregation was added to Cassandra under CASSANDRA-4914, which is available in the 2.2.0-rc1 release.

answered Oct 02 '22 12:10

mikea

In one particular application we're using Cassandra for the write speed and then have the app compact the data down to a more compressed, slightly aggregated summary form. Then we run an hourly job to copy the summary form to a Postgres table. This approach doesn't score highly for elegance, but it's simple, and it means that we can run ad-hoc analytic queries without having to complicate the primary data ingress path or build bespoke aggregation into the CQL app.
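As a rough illustration of the "compact down to a summary form" step, here is a minimal Python sketch. It is not the code from the application described above: the `Summary` class, the `roll_up` function, the hourly bucket size, and the `(metric, timestamp, value)` row shape are all assumptions made for the example, and the in-memory list stands in for rows read back from the raw Cassandra table.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Summary:
    """Running aggregate for one (metric, hour) bucket (hypothetical shape)."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = float("-inf")

    def add(self, value: float) -> None:
        self.count += 1
        self.total += value
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)

    @property
    def average(self) -> float:
        # Derived from count and total, so it never needs to be stored.
        return self.total / self.count


def roll_up(events):
    """Group (metric, timestamp, value) rows into hourly summaries.

    `events` is an assumed stand-in for rows read from the raw table.
    """
    buckets = defaultdict(Summary)
    for metric, ts, value in events:
        # Truncate the timestamp to the start of its hour.
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[(metric, hour)].add(value)
    return dict(buckets)


raw_events = [
    ("cpu", datetime(2014, 5, 8, 3, 5, tzinfo=timezone.utc), 10.0),
    ("cpu", datetime(2014, 5, 8, 3, 40, tzinfo=timezone.utc), 30.0),
    ("cpu", datetime(2014, 5, 8, 4, 1, tzinfo=timezone.utc), 50.0),
]

summaries = roll_up(raw_events)
for key in sorted(summaries):
    metric, hour = key
    s = summaries[key]
    print(metric, hour.isoformat(), s.count, s.total, s.minimum, s.maximum)
```

Storing the count and the total rather than the average keeps the summaries combinable: an average over a day can be recomputed from the hourly rows, which an average-only column would not allow.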

answered Oct 02 '22 11:10

0x6e6562
