I have a column in Redshift whose values look like this: <code>abcd1234df-TEXT_I-WANT</code>. Each of the first 10 characters can be either a letter or a digit. With a capture-group regex, I would use a poorly written expression like <code>(\w\w\w\w\w\w\w\w\w\w\W)(.*)</code> and grab the 2nd group. But I'm having trouble implementing this in Redshift, so I'm not sure how to grab only the text after the first hyphen.

How to use a regex capture group in redshift (or alternative)

3 Answers

As mentioned before, regex might be overkill. However, it can be useful in some cases.

Here's a basic replace pattern:

SELECT
    regexp_replace(
      'abcd1234df-TEXT_I-WANT'  -- use your input column here instead
    , '^[a-z0-9]{10}-(.*)$'     -- matches whole string, captures "TEXT_I-WANT" in $1
    , '$1'                      -- inserts $1 to return TEXT_I-WANT
    )
;
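To apply the same pattern to a column rather than a literal, something like the following sketch should work. The CTE name `example_rows` and the second sample value are made up for illustration, and the character class is widened to `[A-Za-z0-9]` on the assumption that the prefix may contain uppercase letters. Note that `regexp_replace` returns the input unchanged when the pattern does not match, so rows without the expected prefix pass through as-is.

```sql
-- Hypothetical sample data; replace the CTE with your own table and column.
WITH example_rows AS (
    SELECT 'abcd1234df-TEXT_I-WANT' AS col
    UNION ALL
    SELECT 'ABCD1234DF-Other-Text'
)
SELECT
    col
  , regexp_replace(
      col
    , '^[A-Za-z0-9]{10}-(.*)$'  -- 10 letters/digits, a hyphen, then the captured rest
    , '$1'                      -- keep only the captured part
    ) AS after_prefix
FROM example_rows
;
```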


answered Nov 01 '22 11:11

wp78de

@wp78de gives very good advice to use REGEXP_REPLACE. It allows you to choose the capture group. Using your regex, it would look like this, although you don't need 2 groups here; one is sufficient.

select 
  regexp_replace(
    'abcd1234df-TEXT_I-WANT',
    '(\\w\\w\\w\\w\\w\\w\\w\\w\\w\\w\\W)(.*)', 
    '$2' -- replacement selecting 2nd capture group
  );

Another option, although less flexible, is REGEXP_SUBSTR with the e parameter set (extract a substring using a subexpression). It lets you select a substring, but only that of the first capture group in your regex. You also have to pass the position and occurrence parameters explicitly, at their default value of 1:

Using the regex you suggested, but with only 1 group:

select 
  regexp_substr(
    'abcd1234df-TEXT_I-WANT',
    '\\w\\w\\w\\w\\w\\w\\w\\w\\w\\w\\W(.*)', 
    1, -- position
    1, -- occurrence
    'e' -- parameters
  );
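As a rough sketch of how this might look on a column, the query below uses a hypothetical CTE named `example_rows` and a more compact character class in place of the repeated `\\w`. As far as I know, `regexp_substr` returns an empty string when nothing matches, so `NULLIF` is used here (an optional choice, not part of the original answer) to turn that into NULL.

```sql
-- Hypothetical sample data; replace the CTE with your own table and column.
WITH example_rows AS (
    SELECT 'abcd1234df-TEXT_I-WANT' AS col
    UNION ALL
    SELECT 'no_prefix_here'
)
SELECT
    col
  , NULLIF(
      regexp_substr(
        col
      , '^[A-Za-z0-9]{10}-(.*)$'  -- single capture group holds the wanted text
      , 1                         -- position
      , 1                         -- occurrence
      , 'e'                       -- return the capture group, not the whole match
      )
    , ''
    ) AS after_prefix             -- NULL when the pattern does not match
FROM example_rows
;
```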

answered Nov 01 '22 09:11

botchniaque

Regular expressions might be overkill. Basic string operations are good enough:

select substring(col from position('-' in col) + 1)
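One thing to watch for: when a value contains no hyphen, `position` returns 0, so the expression above returns the whole string. If that is not what you want, a guard along these lines may help. This is only a sketch; the CTE name `example_rows` and its sample values are invented for illustration.

```sql
-- Hypothetical sample data; replace the CTE with your own table and column.
WITH example_rows AS (
    SELECT 'abcd1234df-TEXT_I-WANT' AS col
    UNION ALL
    SELECT 'nohyphenhere'
)
SELECT
    col
  , CASE
        WHEN position('-' in col) > 0
        THEN substring(col from position('-' in col) + 1)
    END AS after_hyphen  -- NULL when the value has no hyphen
FROM example_rows
;
```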

answered Nov 01 '22 10:11

Gordon Linoff

Tags:

regex

sql

amazon-redshift

user1874064
