
How does Apache Cassandra do aggregate operations?

I'm fairly new to Apache Cassandra and NoSQL in general.

In SQL I can do aggregate operations like:

SELECT 
  country, sum(age) / count(*) AS averageAge 
FROM people 
GROUP BY country;

This is nice because the calculation happens inside the DB, so the client layer doesn't have to fetch every row of the 'people' table to do it.

Is this possible in Apache Cassandra? How?

asked Jun 16 '10 11:06 by sanity



2 Answers

Cassandra is primarily a mechanism that supports fast writes and look-ups. It does not support SQL-style calculations such as aggregates, since it is not designed for them. I would suggest reading about popular Cassandra use cases to get better insight :) I have bookmarked some articles on my delicious page. Here is the link:

http://delicious.com/vibhutesagar/cassandra

answered Sep 18 '22 15:09 by Sagar V


Using SliceRange could be thought of as Cassandra's version of LIMIT and ORDER BY.

GROUP BY, COUNT and SUM are not supported out of the box.

Taking a look at the API page from the wiki is a good start.
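
Without server-side GROUP BY, COUNT or SUM, the aggregation from the question has to happen in the client. Below is a minimal sketch of that step in Python, assuming the rows have already been read from Cassandra by whatever client you use. The function name `average_age_by_country` and the sample rows are hypothetical, for illustration only; no Cassandra API call is shown.

```python
from collections import defaultdict


def average_age_by_country(rows):
    """Client-side equivalent of:
    SELECT country, sum(age) / count(*) FROM people GROUP BY country.

    `rows` is any iterable of (country, age) pairs that were already
    fetched from the data store (assumption: fetching is done elsewhere).
    """
    # country -> [running sum of ages, number of rows seen]
    totals = defaultdict(lambda: [0, 0])
    for country, age in rows:
        totals[country][0] += age
        totals[country][1] += 1
    # True division, so the average is a float
    return {country: s / n for country, (s, n) in totals.items()}


# Hypothetical sample data standing in for rows read from the 'people' data
people = [("NL", 30), ("NL", 40), ("SE", 25)]
print(average_age_by_country(people))  # {'NL': 35.0, 'SE': 25.0}
```

The function keeps only a running sum and count per country, so the rows can be streamed through it in slices rather than held in memory all at once.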

answered Sep 21 '22 15:09 by Schildmeijer