Using Redshift CURSOR to insert and iterate

I recently found that Redshift supports cursors, specifically the DECLARE and FETCH commands. I managed to create a cursor and fetch some of its rows, but I couldn't find a way to:

  1. Insert the fetched results into either a table or a variable.
  2. Iterate over the rows of the declared cursor dynamically (based on a WHILE loop or any other logical test).

I couldn't find any Amazon documentation on how to do this. Does anyone know whether it is even possible? Thanks.

Yoav asked Jun 26 '16 20:06


1 Answer

You can achieve this with a stored procedure, which supports variables. Inside the procedure you can read a dataset, iterate through it, and apply your logic to each row.

The following example shows a procedure with output arguments. Arguments are input (IN), input and output (INOUT), and output (OUT).

CREATE OR REPLACE PROCEDURE test_sp2(f1 IN int, f2 INOUT varchar(256), out_var OUT varchar(256))
AS $$
DECLARE
  loop_var int;
BEGIN
  -- Reject null inputs up front.
  IF f1 IS NULL OR f2 IS NULL THEN
    RAISE EXCEPTION 'input cannot be null';
  END IF;
  -- Recreate the temp table on every call.
  DROP TABLE IF EXISTS my_etl;
  CREATE TEMP TABLE my_etl(a int, b varchar);
  -- Insert one row per iteration; f2 doubles in length each time.
  FOR loop_var IN 1..f1 LOOP
    INSERT INTO my_etl VALUES (loop_var, f2);
    f2 := f2 || '+' || f2;
  END LOOP;
  -- Return the number of inserted rows through the OUT argument.
  SELECT INTO out_var count(*) FROM my_etl;
END;
$$ LANGUAGE plpgsql;


call test_sp2(2,'2019');

         f2          | column2
---------------------+---------
 2019+2019+2019+2019 | 2
(1 row)

source - https://docs.aws.amazon.com/redshift/latest/dg/stored-procedure-create.html
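
The example above loops over an integer range rather than a cursor. To address the two points in the question more directly, here is a minimal sketch of a procedure that opens an explicit cursor, fetches each row into a record variable, and inserts it into a table under a WHILE condition. The names copy_with_cursor, my_source, my_target, max_rows and copied are hypothetical, and the sketch assumes a table my_source(a int, b varchar(256)) already exists.

```sql
-- Sketch only: all object names here are assumptions, not from the AWS docs.
CREATE OR REPLACE PROCEDURE copy_with_cursor(max_rows IN int, copied OUT int)
AS $$
DECLARE
  rec RECORD;
  -- Explicit cursor over the (assumed) source table.
  src_cur CURSOR FOR SELECT a, b FROM my_source ORDER BY a;
BEGIN
  copied := 0;
  DROP TABLE IF EXISTS my_target;
  CREATE TEMP TABLE my_target(a int, b varchar(256));

  OPEN src_cur;
  -- Iterate under a logical test: stop once max_rows rows have been copied.
  WHILE copied < max_rows LOOP
    FETCH src_cur INTO rec;      -- fetched row lands in a variable
    EXIT WHEN NOT FOUND;         -- cursor exhausted
    INSERT INTO my_target VALUES (rec.a, rec.b);
    copied := copied + 1;
  END LOOP;
  CLOSE src_cur;
END;
$$ LANGUAGE plpgsql;

-- The OUT argument is omitted in the call, as in the example above.
CALL copy_with_cursor(10);
```

If no per-row condition is needed, the shorter form FOR rec IN SELECT ... LOOP ... END LOOP opens and closes the cursor implicitly. When the per-row logic can be expressed in SQL, a single set-based INSERT INTO ... SELECT is likely to be much faster than a row-by-row loop.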

SaM answered Sep 21 '22 00:09