
Error while using regexp_split_to_table (Amazon Redshift)

Tags: set-returning-functions, amazon-redshift

I have the same question as this:
Splitting a comma-separated field in Postgresql and doing a UNION ALL on all the resulting tables
Just that my 'fruits' column is delimited by '|'. When I try:

SELECT 
    yourTable.ID, 
    regexp_split_to_table(yourTable.fruits, E'|') AS split_fruits
FROM yourTable

I get the following:

ERROR: type "e" does not exist

Q1. What does the E do? I saw some examples where E is not used. The official docs don't explain it in their "quick brown fox..." example.

Q2. How do I use '|' as the delimiter for my query?

Edit: I am using PostgreSQL 8.0.2. unnest() and regexp_split_to_table() both are not supported.


asked Mar 10 '15 22:03

Reise45

1 Answer

A1

E is a prefix for POSIX-style escape strings. You don't normally need it in modern Postgres. Only prepend it if you want backslash escape sequences in the string to be interpreted, like E'\n' for a newline character. Details and links to documentation:

Insert text with single quotes in PostgreSQL
SQL select where column begins with \

E is pointless noise in your query, but it should still work. The answer you are linking to is not very good, I am afraid.
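
To make the effect of the prefix concrete, here is a small sketch for plain Postgres. It assumes standard_conforming_strings = on (the default since PostgreSQL 9.1); with the old setting, both literals would behave the same.

```sql
-- With E, backslash-n is interpreted as a newline (3 characters).
-- Without E, the backslash stays literal (4 characters).
SELECT length(E'a\nb') AS len_with_e,
       length('a\nb')  AS len_without_e;

-- A pipe contains nothing to escape, so the prefix changes nothing here.
SELECT E'|' = '|' AS same_string;
```

The first query should return 3 and 4, the second true, which is why the E in the question adds nothing.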

A2

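In Postgres proper, the query runs as is, but it is better without the E. Note that | is the alternation operator in a regular expression, so escape it to split on a literal pipe (this form assumes standard_conforming_strings = on, the default since 9.1):

SELECT id, regexp_split_to_table(fruits, '\|') AS split_fruits
FROM   tbl;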

For simple delimiters, you don't need expensive regular expressions. This is typically faster:

SELECT id, unnest(string_to_array(fruits, '|')) AS split_fruits
FROM   tbl;
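
As a quick check in plain Postgres (not Redshift), the same query can be run against inline sample rows. The values below are made up for illustration only:

```sql
-- Hypothetical sample data standing in for tbl(id, fruits).
SELECT id, unnest(string_to_array(fruits, '|')) AS split_fruits
FROM  (VALUES (1, 'apple|banana')
            , (2, 'cherry')) AS tbl(id, fruits);
```

This yields one row per fruit: (1, apple), (1, banana), (2, cherry). string_to_array() treats the delimiter as plain text, so the pipe needs no escaping here.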

In Postgres 9.3+ you'd rather use a LATERAL join for set-returning functions:

SELECT t.id, f.split_fruits
FROM   tbl t
LEFT   JOIN LATERAL unnest(string_to_array(t.fruits, '|')) AS f(split_fruits) ON true;

Details:

What is the difference between LATERAL and a subquery in PostgreSQL?
PostgreSQL unnest() with element number
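
If the position of each element matters, a sketch along these lines should work in Postgres 9.4 or later, where WITH ORDINALITY is available. The column alias pos is an arbitrary name chosen for this example:

```sql
-- Requires PostgreSQL 9.4+; not available in Redshift.
SELECT t.id, f.split_fruits, f.pos
FROM   tbl t
LEFT   JOIN LATERAL unnest(string_to_array(t.fruits, '|'))
                    WITH ORDINALITY AS f(split_fruits, pos) ON true;
```

The LEFT JOIN ... ON true keeps rows from tbl whose fruits value is NULL or empty; those rows come back with NULL in split_fruits and pos.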

Amazon Redshift is not Postgres

It only implements a reduced set of features as documented in its manual. In particular, there are no table functions, including the essential functions unnest(), generate_series() or regexp_split_to_table() when working with its "compute nodes" (accessing any tables).

You should go with a normalized table layout to begin with (extra table with one fruit per row).
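
A minimal sketch of such a layout follows. The table name tbl_fruit, the column names, and the types are assumptions for illustration. Note that Redshift accepts primary and foreign key constraints but does not enforce them.

```sql
-- Hypothetical names and types; adjust to your data.
CREATE TABLE tbl (
    id int PRIMARY KEY
);

CREATE TABLE tbl_fruit (
    tbl_id int          NOT NULL REFERENCES tbl (id),
    fruit  varchar(100) NOT NULL,
    PRIMARY KEY (tbl_id, fruit)
);

-- One row per fruit, so nothing has to be split at query time.
SELECT tbl_id AS id, fruit
FROM   tbl_fruit;
```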

Or here are some options to create a set of rows in Redshift:

How to select multiple rows filled with constants in Amazon Redshift?

This workaround should do it:

1. Create a table of numbers, with at least as many rows as there can be fruits in your column. Temporary or permanent if you'll keep using it. Say we never have more than 9:
```
CREATE TEMP TABLE nr9(i int);
INSERT INTO nr9(i) VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9);
```

2. Join to the number table and use split_part(), which is actually implemented in Redshift:

SELECT *, split_part(t.fruits, '|', n.i) AS fruit
FROM   nr9 n
JOIN   tbl t ON split_part(t.fruits, '|', n.i) <> '';

Voilà.
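
The join condition above silently drops empty elements (as in 'apple||cherry'). If those should be kept, one possible variant bounds the join by the number of delimiters instead. This is only a sketch; it assumes the tables tbl(id, fruits) and nr9(i) from the steps above, and the alias pos is an arbitrary name:

```sql
-- Number of elements = number of '|' characters + 1.
SELECT t.id, n.i AS pos, split_part(t.fruits, '|', n.i) AS fruit
FROM   tbl t
JOIN   nr9 n ON n.i <= length(t.fruits) - length(replace(t.fruits, '|', '')) + 1
ORDER  BY t.id, n.i;
```

With this condition an empty string in fruits still produces one row (with an empty fruit), while NULL produces none. Rows with more than 9 elements are still truncated unless the number table is extended.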


answered Jan 04 '23 05:01

Erwin Brandstetter
