
AWS Athena - Apply filter and then compute percentiles

I'm using AWS Athena to compute some metrics. I have a dataset like this:

sessionnumber
0
10
-1
10
2
-10
10

I'm trying to compute percentiles on those values, but only for the subset of valid values. A valid value is a sessionnumber >= 1, so I tried this:

with testfun AS 
    (SELECT filter(array_agg(sessionnumber), x -> x >= 1) as validvalues 
     FROM "mydate")

SELECT approx_percentile(validvalues, 0.25) FROM testfun

But I got the following error:

SYNTAX_ERROR: line 17:10: Unexpected parameters (array(integer), double) for function approx_percentile. Expected: approx_percentile(bigint, double) , approx_percentile(bigint, bigint, double) , approx_percentile(bigint, bigint, double, double) , approx_percentile(bigint, array(double)) , approx_percentile(bigint, bigint, array(double)) , approx_percentile(double, double) , approx_percentile(double, bigint, double, double) , approx_percentile(double, bigint, double) , approx_percentile(double, array(double)) , approx_percentile(double, bigint, array(double)) , approx_percentile(real, double) , approx_percentile(real, bigint, double, double) , approx_percentile(real, bigint, double) , approx_percentile(real, array(double)) , approx_percentile(real, bigint, array(double))

I understand the error (approx_percentile expects a numeric column, not an array), but I cannot find a way to fix it with AWS Athena / PrestoDB. Is it even possible to do such a thing?

alifirat asked Sep 03 '25 06:09


1 Answer

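I found a way to solve it, so I'm sharing it here. The idea is to filter the rows with a WHERE clause first and then aggregate the remaining values: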

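-- Keep only the valid rows, then compute all percentiles in one pass
WITH validValues AS 
(SELECT approx_percentile(sessionnumber, ARRAY[0.25, 0.50, 0.75, 0.95, 0.99]) AS percentiles
 FROM (SELECT sessionnumber FROM "20180407" WHERE sessionnumber >= 1))

-- The testfun CTE from the question is not defined here, so select from validValues only
SELECT percentiles FROM validValues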
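
As a possible alternative, Presto's aggregate FILTER clause can apply the condition to the aggregate itself, which may be handy when other aggregates in the same query should still see all rows. The sketch below is untested on Athena and makes some assumptions: the inline VALUES list is a hypothetical stand-in for the real table, using the sample data from the question, and the column aliases p25 and total_rows are made up for illustration.

```sql
-- Hypothetical inline data replacing the real table
SELECT
    -- Only rows with sessionnumber >= 1 feed the percentile
    approx_percentile(sessionnumber, 0.25)
        FILTER (WHERE sessionnumber >= 1) AS p25,
    -- Unfiltered aggregate over the same rows, for comparison
    count(*) AS total_rows
FROM (
    VALUES (0), (10), (-1), (10), (2), (-10), (10)
) AS t (sessionnumber)
```

If every aggregate in the query needs the same condition, a plain WHERE clause as in the query above is simpler.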

alifirat answered Sep 04 '25 21:09


