Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Redshift: Executing a dynamic query from a string

I would like to execute a dynamic SQL query stored in a string field on Amazon Redshift.

My background is mostly T-SQL relational databases. I used to build SQL statements dynamically, store them into variables and them execute them. I know Redshift can prepare and execute statements, but I wonder if it is possible to execute a query stored in a string field.

I have a piece of code that dynamically builds the code below with stats on several tables using pg_* system tables. Every column/table name is dynamically calculated. Here's an example of the query output:

SELECT h_article_id AS key, 'transport_parameters_weight_in_grams' AS col_name, COUNT(DISTINCT transport_parameters_weight_in_grams) AS count_value FROM dv.s_products GROUP BY h_article_id UNION ALL
SELECT h_article_id AS key, 'transport_parameters_width_in_mm' AS col_name, COUNT(DISTINCT transport_parameters_width_in_mm) AS count_value FROM dv.s_products GROUP BY h_article_id UNION ALL
SELECT h_article_id AS key, 'label_owner_info_communication_address' AS col_name, COUNT(DISTINCT label_owner_info_communication_address) AS count_value FROM dv.s_products GROUP BY h_article_id

I would like to input this dynamic piece of code within another query, so I can make some statistics, like so:

SELECT col_name, AVG(count_value*1.00) AS avg_count
FROM (
  'QUERY ABOVE'
) A
GROUP BY col_name;

This would ouput something like:

col_name                                avg_count
transport_parameters_weight_in_grams    1.00
transport_parameters_width_in_mm        1.00
label_owner_info_communication_address  0.60

The natural way for me to do this would be to store everything as a string in a variable and execute it. But I'm afraid Redshift does not support this.

Is there an alternative way to really build dynamic SQL code?

like image 403
anahnarciso Avatar asked Sep 05 '16 14:09

anahnarciso


3 Answers

This is possible now that we have added support for Stored Procedures. "Overview of Stored Procedures in Amazon Redshift"

For example, this stored procedure counts the rows in a table and inserts the table name and row count into another table. Both table names are provided as input.

CREATE PROCEDURE get_tbl_count(IN source_tbl VARCHAR, IN count_tbl VARCHAR) AS $$
BEGIN
EXECUTE 'INSERT INTO ' || quote_ident(count_tbl) 
        || ' SELECT ''' || source_tbl ||''', COUNT(*) FROM ' 
        || quote_ident(source_tbl) || ';' 
RETURN;
END;
$$ LANGUAGE plpgsql;

In your example the query to executed could be passed in as a string.

like image 72
Joe Harris Avatar answered Oct 05 '22 02:10

Joe Harris


I am using Postgre on Redshift, and I ran into this issue and found a solution.

I was trying to create a dynamic query, putting in my own date.

date = dt.date(2018, 10, 30)

query = ''' select * from table where date >= ''' + str(my_date) + ''' order by date '''

But, the query entirely ignores the condition when typing it this way.

However, if you use the percent sign (%), you can insert the date correctly.

The correct way to write the above statement is:

query = ''' select * from table where date >= ''' + ''' '%s' ''' % my_date + ''' order by date '''

So, maybe this is helpful, or maybe it is not. I hope it helps at least one person in my situation!

Best wishes.

like image 24
spen.smith Avatar answered Oct 05 '22 02:10

spen.smith


No. There is not a straightforward way to run dynamic built SQL code in Redshift.

You can't define SQL variables, or create stored procedures, as you would have in MS SQL Server.

You can create Python Functions in Redshift, but you would be coding in Python vs. SQL.

You can use the "PREPARE" and "EXECUTE" statements to run "pre-defined" SQL queries, but you would have to create the statements outside of the database, before passing them to the execute command. By creating the statement outside of the database, in a way defeats the purpose.... You can create any statement in your "favorite" programming language.

As I said, this SQL based, in-database dynamic SQL does not exist.

Basically, you need to run this logic in your application or using something such as AWS Data Pipeline.

like image 26
BigDataKid Avatar answered Oct 05 '22 01:10

BigDataKid