
Aggregate UDFs with Python in Redshift

I managed to write a few scalar functions with Python in Amazon Redshift, i.e. functions that take one or a few columns as input and return a single value based on some logic or transformation.

But is there any way to pass all the values of a numeric column (i.e. a list) to a UDF and calculate statistics on them, for example the mean or standard deviation?

asked Oct 25 '15 13:10 by and_apo




1 Answer

The documentation states that only scalar UDFs are supported (see http://docs.aws.amazon.com/redshift/latest/dg/user-defined-functions.html).

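However, if the list of values is not too large, you can cheat: create a scalar UDF that takes a single string argument, and pass it the result of the LISTAGG function, which concatenates the column's values into one delimited string. Keep in mind that LISTAGG returns a VARCHAR of at most 65,535 bytes, so this only works for fairly small inputs.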

e.g.: select udfSum(listagg(val, '|')) from table;

see: http://docs.aws.amazon.com/redshift/latest/dg/r_LISTAGG.html

answered Sep 20 '22 06:09 by Robert Chevallier