
Can't use the output of Redshift catalog queries

I am having all kinds of problems working with queries against the Redshift catalog tables.

To illustrate, the following works:

select "table_name"::text as "table"
from "information_schema"."tables"
where table_schema not like 'pg_%' and table_schema != 'information_schema'

and the following works:

create view works as 
select "table_name"::text as "table"
from "information_schema"."tables"
where table_schema not like 'pg_%' and table_schema != 'information_schema'

But the following fails:

create table fails as
select "table_name"::text as "table"
from "information_schema"."tables"
where table_schema not like 'pg_%' and table_schema != 'information_schema'

With:

[SQL]create table fails as
select "table_name"::text as "table"
from "information_schema"."tables"
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
INFO:  Function "has_table_privilege(oid,text)" not supported.
[Err] ERROR:  Specified types or functions (one per INFO message) not supported on Redshift tables.

From http://docs.aws.amazon.com/redshift/latest/dg/c_join_PG.html I read

If you write a join query that explicitly or implicitly references a column that has an unsupported data type, the query returns an error.

Does this mean that a create table based on a select against catalog tables cannot work, even though I cast the unusual field types to text, because under the hood Redshift is doing joins that touch unsupported types?

Create table is one manifestation of the problem. Another is that I can't unload a view, or anything else based on a catalog query. For example, the following also fails with error messages similar to those above:

unload ('select * from "works"') to 's3://etc'

At the moment, it seems the only way I can work with this data is to issue the query from an external program and have that program write the result set back to a table manually. In other words, it can't be done from within the database.

Does anybody have another solution?

Nicholas asked Feb 26 '16 05:02



1 Answer

I've come across a similar problem. I'm unsure of the exact cause, but I have found a workaround.

Instead of looking up the values in information_schema, try looking up relation and attribute names in the pg_catalog tables.

For example, the following query provides the column names for a particular table:

SELECT attname::text
FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = '<your_table_name>')
  AND attname NOT IN ('insertxid', 'deletexid', 'oid', 'tableoid', 'xmin', 'cmin', 'xmax', 'cmax', 'ctid');

This query can be used in a CREATE TABLE statement:

CREATE TABLE consumer_person_dated_attr_types AS
SELECT attname::text
FROM pg_attribute
WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = '<your_table>')
  -- insertxid and deletexid are the Redshift-specific system columns
  AND attname NOT IN ('insertxid', 'deletexid', 'oid', 'tableoid', 'xmin', 'cmin', 'xmax', 'cmax', 'ctid');

Similarly, the following query creates a table containing one column for table name, and another for schema name:

CREATE TABLE tmp_table_names AS
SELECT relname::text, nspname::text
FROM pg_class c
JOIN pg_namespace n
ON n.oid = c.relnamespace
WHERE nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema');

Note that the catalog tables expose many more system-level details than information_schema does. For example, every table has internal system columns that pg_attribute returns alongside the columns defined in your DDL, so if you only want the DDL columns you need to exclude the system ones, as the NOT IN list above does. In addition to the standard PostgreSQL system columns, Redshift returns deletexid and insertxid, so those should be excluded as well. The same goes for the list of tables: pg_class includes relations from many system schemas, which is why the query above filters on nspname.
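As an alternative to listing the system columns by name, the exclusion can probably be expressed through attnum. This is only a sketch: it assumes Redshift follows stock PostgreSQL in giving system columns non-positive attnum values and in exposing attisdropped and relkind, and the output aliases are made up for illustration. I have not checked whether it behaves the same inside a CREATE TABLE ... AS as the name-based queries above do.

```sql
-- Sketch: user-defined columns for every ordinary table outside the system schemas.
-- Assumption: system columns (oid, ctid, insertxid, ...) have attnum <= 0,
-- as in stock PostgreSQL.
SELECT n.nspname::text AS schema_name,    -- alias names are illustrative
       c.relname::text AS table_name,
       a.attname::text AS column_name,
       a.attnum::int   AS column_position
FROM pg_attribute a
JOIN pg_class c     ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.relkind = 'r'                      -- ordinary tables only
  AND a.attnum > 0                         -- skip system columns
  AND NOT a.attisdropped                   -- skip dropped columns
  AND n.nspname NOT IN ('pg_catalog', 'pg_toast', 'information_schema')
ORDER BY 1, 2, 4;
```

Joining through pg_namespace also sidesteps a limitation of the scalar subquery on relname, which raises an error if the same table name exists in more than one schema.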

I suspect that the original failure relates to the data types of the columns. Many columns in information_schema have a data type of 'sql_identifier' and a JDBC type of 'OTHER' (when viewed in SQL Workbench/J), whereas the corresponding columns in the pg_catalog tables have a data type of 'name' and a JDBC type of 'VARCHAR'.
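One way to check the declared types without relying on a JDBC client is to look them up in the catalog itself. This is a sketch, assuming pg_type can be queried on your cluster the same way as in stock PostgreSQL. It only reports the types, so it does not confirm or rule out the cause of the error.

```sql
-- Sketch: declared type of each column of information_schema.tables.
SELECT a.attname::text AS column_name,
       t.typname::text AS type_name
FROM pg_attribute a
JOIN pg_class c     ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
JOIN pg_type t      ON t.oid = a.atttypid
WHERE n.nspname = 'information_schema'
  AND c.relname = 'tables'
  AND a.attnum > 0                         -- skip system columns
ORDER BY a.attnum;
```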

Ken answered Oct 12 '22 10:10
