
Refactor a PL/pgSQL function to return the output of various SELECT queries

Tags: sql, database, postgresql, dynamic-sql, plpgsql

I wrote a function that outputs a well-formed PostgreSQL SELECT query as text. Now I no longer want to output text, but actually run the generated SELECT statement against the database and return the result, just like the query itself would.

What I have so far:

    CREATE OR REPLACE FUNCTION data_of(integer)
      RETURNS text AS
    $BODY$
    DECLARE
       sensors varchar(100);   -- holds list of column names
       type    varchar(100);   -- holds name of table
       result  text;           -- holds SQL query
       -- declare more variables
    BEGIN
       -- do some crazy stuff
       result := 'SELECT\r\nDatahora,' || sensors ||
          '\r\n\r\nFROM\r\n' || type ||
          '\r\n\r\nWHERE\r\nid=' || $1 ||'\r\n\r\nORDER BY Datahora;';
       RETURN result;
    END;
    $BODY$
    LANGUAGE 'plpgsql' VOLATILE;
    ALTER FUNCTION data_of(integer) OWNER TO postgres;

sensors holds the list of column names for the table type. Those are declared and filled in the course of the function. Eventually, they hold values like:

sensors: 'column1, column2, column3'
Except for Datahora (timestamp), all columns are of type double precision.

type: 'myTable'
Can be the name of one of four tables. Each has different columns, except for the common column Datahora.

Definition of the underlying tables.

The variable sensors will hold all columns displayed here for the corresponding table in type. For example: If type is pcdmet then sensors will be 'datahora,dirvento,precipitacao,pressaoatm,radsolacum,tempar,umidrel,velvento'

The variables are used to build a SELECT statement that is stored in result. Like:

    SELECT Datahora, column1, column2, column3
    FROM   myTable
    WHERE  id=20
    ORDER  BY Datahora;

Right now, my function returns this statement as text. I copy-paste and execute it in pgAdmin or via psql. I want to automate this, run the query automatically and return the result. How can I do that?

asked Jul 31 '12 12:07

waldyr.ar

1 Answer

Dynamic SQL and `RETURN` type

(I saved the best for last, keep reading!)
You want to execute dynamic SQL. In principle, that's simple in PL/pgSQL with the help of EXECUTE. You don't need a cursor. In fact, most of the time you are better off without explicit cursors.

The problem you run into: you want to return records of a yet undefined type. A function needs to declare its return type in the RETURNS clause (or with OUT or INOUT parameters). In your case you would have to fall back to anonymous records, because the number, names and types of the returned columns vary. Like:

    CREATE FUNCTION data_of(integer)
      RETURNS SETOF record AS ...

However, this is not particularly useful. You have to provide a column definition list with every call. Like:

    SELECT * FROM data_of(17)
    AS foo (column_name1 integer
          , column_name2 text
          , column_name3 real);
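To make the drawback concrete, here is a minimal sketch of the anonymous-record variant. The table demo_rec and the function demo_data_of_record are hypothetical names made up for this illustration, not objects from the question.

```sql
-- Hypothetical demo table; names and values are illustrative only.
CREATE TEMP TABLE demo_rec (id int, datahora timestamp, tempar float8);
INSERT INTO demo_rec VALUES (17, '2012-07-31 12:00', 21.5);

CREATE OR REPLACE FUNCTION demo_data_of_record(_id int)
  RETURNS SETOF record
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- The query string is only resolved at execution time
   RETURN QUERY EXECUTE
      'SELECT datahora, tempar FROM demo_rec WHERE id = $1 ORDER BY datahora'
   USING _id;
END
$func$;

-- The caller has to spell out names and types of all returned columns:
SELECT *
FROM   demo_data_of_record(17) AS t(datahora timestamp, tempar float8);
```

If the column definition list does not match what the dynamic query actually returns, the call fails at run time.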

But how would you even do this, when you don't know the columns beforehand?
You could use less structured document data types like json, jsonb, hstore or xml. See:

How to store a data table in database?
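As a rough sketch of the document-type route, assuming PostgreSQL 9.5 or later for to_jsonb(): every row is returned as a single jsonb value, so the function signature no longer depends on the selected columns. The table demo_json and the function demo_data_of_json are hypothetical names for this illustration.

```sql
-- Hypothetical demo table; names and values are illustrative only.
CREATE TEMP TABLE demo_json (id int, datahora timestamp, tempar float8, umidrel float8);
INSERT INTO demo_json VALUES (17, '2012-07-31 12:00', 21.5, 60);

CREATE OR REPLACE FUNCTION demo_data_of_json(_id int)
  RETURNS SETOF jsonb
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sensors text := 'tempar, umidrel';  -- assumed to be trusted input here
   _type    text := 'demo_json';
BEGIN
   RETURN QUERY EXECUTE format(
      'SELECT to_jsonb(t)
       FROM  (SELECT datahora, %s FROM %I WHERE id = $1) t
       ORDER  BY t.datahora'
    , _sensors, _type)
   USING _id;
END
$func$;

-- One jsonb object per row, keyed by column name
SELECT * FROM demo_data_of_json(17);
```

The price is that the caller gets one loosely typed value per row and has to extract individual fields, for instance with the ->> operator.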

But, for the purpose of this question, let's assume you want to return individual, correctly typed and named columns as much as possible.

Simple solution with fixed return type

The column datahora seems to be a given. I'll assume data type timestamp and that there are always two more columns with varying names and data types.

Names we'll abandon in favor of generic names in the return type.
Types we'll abandon, too, and cast all to text since every data type can be cast to text.

    CREATE OR REPLACE FUNCTION data_of(_id integer)
      RETURNS TABLE (datahora timestamp, col2 text, col3 text)
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _sensors text := 'col1::text, col2::text';  -- cast each col to text
       _type    text := 'foo';
    BEGIN
       RETURN QUERY EXECUTE '
          SELECT datahora, ' || _sensors || '
          FROM   ' || quote_ident(_type) || '
          WHERE  id = $1
          ORDER  BY datahora'
       USING  _id;
    END
    $func$;

The variables _sensors and _type could be input parameters instead.

Note the RETURNS TABLE clause.

Note the use of RETURN QUERY EXECUTE. That is one of the more elegant ways to return rows from a dynamic query.

I use a name for the function parameter, just to make the USING clause of RETURN QUERY EXECUTE less confusing. $1 in the SQL-string does not refer to the function parameter but to the value passed with the USING clause. (Both happen to be $1 in their respective scope in this simple example.)
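A tiny standalone sketch of that scoping rule, using made-up variable names: inside the query string, $1 and $2 are bound by position to the expressions in the USING list, regardless of any parameters of the surrounding function.

```sql
DO
$do$
DECLARE
   _a   int := 10;
   _b   int := 32;
   _sum int;
BEGIN
   -- $1 and $2 refer to the USING list by position,
   -- not to parameters of an enclosing function
   EXECUTE 'SELECT $1 + $2'
   INTO  _sum
   USING _a, _b;

   RAISE NOTICE 'sum = %', _sum;  -- reports 42
END
$do$;
```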

Note the example value for _sensors: each column is cast to type text.

Concatenating identifiers into a query string like this is very vulnerable to SQL injection, so I run the table name through quote_ident() to protect against it. Lumping several column names together in the single variable _sensors prevents the use of quote_ident() (and is typically a bad idea!). Make sure nothing harmful can get in there some other way, for instance by running each column name through quote_ident() individually. A VARIADIC parameter comes to mind ...
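Here is one way the VARIADIC idea could look, as a sketch rather than a finished solution. It assumes PostgreSQL 9.4 or later (for WITH ORDINALITY); the table demo_var and the function demo_data_of_cols are hypothetical. Because the return type is still fixed, the caller has to pass exactly two column names.

```sql
-- Hypothetical demo table; names and values are illustrative only.
CREATE TEMP TABLE demo_var (id int, datahora timestamp, tempar float8, umidrel float8);
INSERT INTO demo_var VALUES (17, '2012-07-31 12:00', 21.5, 60);

CREATE OR REPLACE FUNCTION demo_data_of_cols(_id int, VARIADIC _cols text[])
  RETURNS TABLE (datahora timestamp, col2 text, col3 text)
  LANGUAGE plpgsql AS
$func$
DECLARE
   _sensors text;
BEGIN
   -- Quote every column name separately, cast it to text,
   -- and keep the order in which the names were passed
   SELECT string_agg(quote_ident(u.c) || '::text', ', ' ORDER BY u.ord)
   INTO   _sensors
   FROM   unnest(_cols) WITH ORDINALITY AS u(c, ord);

   RETURN QUERY EXECUTE format(
      'SELECT datahora, %s FROM %I WHERE id = $1 ORDER BY datahora'
    , _sensors, 'demo_var')
   USING _id;
END
$func$;

SELECT * FROM demo_data_of_cols(17, 'tempar', 'umidrel');
```

A malicious "column name" now ends up as a double-quoted identifier, so it raises an "undefined column" error instead of being executed as SQL.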

Simpler since PostgreSQL 9.1

With version 9.1 or later you can use format() to further simplify:

    RETURN QUERY EXECUTE format('
       SELECT datahora, %s  -- identifier passed as unescaped string
       FROM   %I            -- assuming the name is provided by user
       WHERE  id = $1
       ORDER  BY datahora'
      ,_sensors, _type)
    USING  _id;

Again, escaping each column name individually would be the clean way.
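For reference, a quick sketch of what the three format() specifiers do with their arguments. The input strings are arbitrary examples.

```sql
SELECT format('%I', 'pcdmet')    AS plain_identifier   -- pcdmet
     , format('%I', 'My Table')  AS quoted_identifier  -- "My Table"
     , format('%L', 'O''Reilly') AS quoted_literal     -- 'O''Reilly'
     , format('%s', 'a, b')      AS raw_string;        -- a, b
```

%I behaves like quote_ident(), %L like quote_nullable(), and %s inserts the string unchanged, which is why a comma-separated column list passed with %s gets no protection.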

Variable number of columns sharing the same type

After your question update, it looks like your return type has

a variable number of columns
but all columns of the same type double precision (alias float8)

Use an ARRAY type in this case to nest a variable number of values. Additionally, I return an array with column names:

    CREATE OR REPLACE FUNCTION data_of(_id integer)
      RETURNS TABLE (datahora timestamp, names text[], "values" float8[])  -- VALUES is a key word, hence quoted
      LANGUAGE plpgsql AS
    $func$
    DECLARE
       _sensors text := 'col1, col2, col3';  -- plain list of column names
       _type    text := 'foo';
    BEGIN
       RETURN QUERY EXECUTE format('
          SELECT datahora
               , string_to_array($1, '', '')  -- AS names (quotes doubled inside the string)
               , ARRAY[%s]                    -- AS values
          FROM   %I
          WHERE  id = $2
          ORDER  BY datahora'
        , _sensors, _type)
       USING  _sensors, _id;
    END
    $func$;
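On the calling side, the two parallel arrays can be turned back into one row per sensor. The sketch below assumes PostgreSQL 9.4 or later (for unnest() with multiple arrays) and uses a hand-written VALUES row in place of the function result, so the data is made up for illustration.

```sql
SELECT t.datahora, u.sensor, u.val
FROM  (
   VALUES (timestamp '2012-07-31 12:00'
         , '{col1,col2,col3}'::text[]
         , '{1.5,2.5,3.5}'::float8[])
   ) AS t(datahora, names, vals)
CROSS  JOIN LATERAL unnest(t.names, t.vals) AS u(sensor, val);
```

This returns three rows, one per array position, pairing each column name with its value.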

Various complete table types

To actually return all columns of a table, there is a simple, powerful solution using a polymorphic type:

    CREATE OR REPLACE FUNCTION data_of(_tbl_type anyelement, _id int)
      RETURNS SETOF anyelement
      LANGUAGE plpgsql AS
    $func$
    BEGIN
       RETURN QUERY EXECUTE format('
          SELECT *
          FROM   %s  -- pg_typeof returns regtype, quoted automatically
          WHERE  id = $1
          ORDER  BY datahora'
        , pg_typeof(_tbl_type))
       USING  _id;
    END
    $func$;

Call (important!):

SELECT * FROM data_of(NULL::pcdmet, 17);

Replace pcdmet in the call with any other table name.

How does this work?

anyelement is a pseudo data type, a polymorphic type, a placeholder for any non-array data type. All occurrences of anyelement in the function evaluate to the same type provided at run time. By supplying a value of a defined type as argument to the function, we implicitly define the return type.

PostgreSQL automatically defines a row type (a composite data type) for every table created, so there is a well defined type for every table. This includes temporary tables, which is convenient for ad-hoc use.

Any type can be NULL. Hand in a NULL value, cast to the table type: NULL::pcdmet.

Now the function returns a well-defined row type and we can use SELECT * FROM data_of() to decompose the row and get individual columns.

pg_typeof(_tbl_type) returns the name of the table as object identifier type regtype. When converted to text, identifiers are double-quoted and schema-qualified automatically where needed, which defends against SQL injection. This can even deal with schema-qualified table names, where quote_ident() would fail. See:

Table name as a PostgreSQL function parameter
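To see the whole mechanism in one place, here is a self-contained sketch with two temporary tables of different shape. The tables demo_pcdmet and demo_pcdhid, their columns, and the function name demo_data_of_any are hypothetical and only serve this illustration.

```sql
-- Two hypothetical tables sharing only id and datahora
CREATE TEMP TABLE demo_pcdmet (id int, datahora timestamp, tempar float8);
CREATE TEMP TABLE demo_pcdhid (id int, datahora timestamp, nivel float8, vazao float8);
INSERT INTO demo_pcdmet VALUES (17, '2012-07-31 12:00', 21.5);
INSERT INTO demo_pcdhid VALUES (17, '2012-07-31 12:00', 3.2, 140);

CREATE OR REPLACE FUNCTION demo_data_of_any(_tbl_type anyelement, _id int)
  RETURNS SETOF anyelement
  LANGUAGE plpgsql AS
$func$
BEGIN
   -- The row type of the first argument names the table to read from
   RETURN QUERY EXECUTE format(
      'SELECT * FROM %s WHERE id = $1 ORDER BY datahora'
    , pg_typeof(_tbl_type))
   USING _id;
END
$func$;

SELECT * FROM demo_data_of_any(NULL::demo_pcdmet, 17);  -- three columns
SELECT * FROM demo_data_of_any(NULL::demo_pcdhid, 17);  -- four columns
```

The same function returns a different, fully typed column set per call, determined only by the type of the NULL value handed in.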

answered Sep 26 '22 02:09

Erwin Brandstetter
