
Counting all entries with KSQL

Is it possible to use KSQL not only to count the entries for each value of a specific column via GROUP BY, but also to get an aggregate over all the entries that stream through the application?

I'm searching for something like this:

| Count all | Count id1 | Count id2 |
|-----------|-----------|-----------|
| 245       | 150       | 95        |

Or more like this in KSQL:

[some timestamp] | Count all | 245   
[some timestamp] | Count id1 | 150   
[some timestamp] | Count id2 | 95   
.   
.   
.   

Thank you
- Tim

Tim.G. asked Nov 14 '17 18:11


2 Answers

You cannot get both the total count and the count for each key from the same query. You need two queries: one that counts the rows for each value in the given column and another that counts all rows. Let's assume you have a stream with two columns, col1 and col2. To count the rows for each value in col1 with an infinite window size, you can use the following query:

SELECT col1, count(*) FROM mystream1 GROUP BY col1;

Counting all the rows takes two statements, since KSQL always needs a GROUP BY clause for aggregation. First you create a new stream with a column that holds a constant value, and then you count the rows grouped by that column; since the value is constant, the count represents the count of all rows. Here is an example:

CREATE STREAM mystream2 AS SELECT 1 AS col3 FROM mystream1;
SELECT col3, count(*) FROM mystream2 GROUP BY col3;
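
If you want the counts per time window rather than over an unbounded window, or want the results kept as tables, the same two queries can be adapted. This is only a minimal sketch, assuming the stream and column names from above; the table names count_per_key and count_all, the alias cnt, and the one-hour window size are illustrative choices, and newer ksqlDB releases may additionally expect EMIT CHANGES on continuous queries:

```sql
-- Per-key counts, restarted every hour (the window size is an arbitrary choice).
CREATE TABLE count_per_key AS
  SELECT col1, COUNT(*) AS cnt
  FROM mystream1
  WINDOW TUMBLING (SIZE 1 HOUR)
  GROUP BY col1;

-- Total count over the same windows, grouped by the constant column col3 of mystream2.
CREATE TABLE count_all AS
  SELECT col3, COUNT(*) AS cnt
  FROM mystream2
  WINDOW TUMBLING (SIZE 1 HOUR)
  GROUP BY col3;
```

Because both tables use the same window definition, the total and the per-key counts for a given hour can be compared side by side.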
Hojjat answered Sep 18 '22 18:09


You can run DESCRIBE EXTENDED on the stream or table to see the total number of messages:

ksql> DESCRIBE EXTENDED <stream or table name>;

Sample output:

Local runtime statistics
------------------------
messages-per-sec:         0   total-messages:   2415888     last-message: 2019-12-06T02:29:43.005Z
knzshirox answered Sep 19 '22 18:09