Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a way to do aggregate functions on Google App Engine?

One of the nice things relational databases support are the aggregate functions like count, sum, avg etc. But it seems that if you are using GAE, when inserting or updating a record you must calculate and store the count, sum, avg, etc. values of the whole table. But what if you have many conditional groupings? Given a Person:

class Person {
    @Id
    Integer age;
    String city;
}

If I want

  1. the total number of persons and
  2. the average age

Is it correct that everytime I create, update or delete a person I should also calculate both aggregates and store them as separate columns in the same table. If I also want the total and average values per city, should I store these values for each city also as separate columns in the same table?

like image 433
Robert McGuinness Avatar asked Jun 14 '09 15:06

Robert McGuinness


People also ask

What is array AGG in BigQuery?

ARRAY_AGG. Returns an ARRAY of expression values. To learn more about the optional arguments in this function and how to use them, see Aggregate function calls. To learn more about the OVER clause and how to use it, see Window function calls.

How aggregate function works in SQL?

An aggregate function performs a calculation on a set of values, and returns a single value. Except for COUNT(*) , aggregate functions ignore null values. Aggregate functions are often used with the GROUP BY clause of the SELECT statement. All aggregate functions are deterministic.

Is sum an aggregate function in SQL?

The SQL Server SUM() function is an aggregate function that calculates the sum of all or distinct values in an expression. In this syntax: ALL instructs the SUM() function to return the sum of all values including duplicates.

What is String_agg in BigQuery?

BigQuery STRING_AGG function returns a value by gathering all expressions in a row and generating a concatenated string of non-null values. The value can be either STRING or BYTES. The result generally depends on the first expression taken by the BigQuery STRING_AGG function.


1 Answers

Right: to use GAE storage properly, you need to selectively de-normalize some aspects of your model, keeping "redundant" data which, in a DB in normal form, you'd recompute on the fly, such as aggregates (overall and "grouped by" ones).

However, don't add such fields to the Person table in your case -- that makes little sense! Make another PersonAggregates table with columns such as City (null/missing for overall totals), Count, TotalAges (easier to maintain: compute the average at any time as total divided by count).

like image 173
Alex Martelli Avatar answered Oct 27 '22 15:10

Alex Martelli