I have a program to pass a full string of groups a user in OpenLDAP to Postgres query. The string is exactly like this:
( 'cn=user1,ou=org1,ou=suborg1,o=myorg','cn=user2,ou=org2,ou=suborg1,o=myorg','cn=user3,ou=org1,ou=suborg1,o=myorg','cn=user4,ou=org1,ou=suborg2,o=myorg' )
In a query, I only want that to be this in Postgres:
'user1','user3'
Basically extract value of cn= when the rest of the string is ou=org1,ou=suborg1,o=myorg.
user2 has ou=org2,ou=suborg1,o=myorg which is org2 so it won't match.
user4 won't match on suborg2 ,... The variation is unlimited so I like to look for exact match ou=org1,ou=suborg1,o=myorg only.
I know how to do replace but it can't handle unlimited scenarios. Is there a clean way to do this in regexp_replace or regexp_extract?
Probably the cleanest is by using SUBSTRING that can return just the captured substring:
SELECT SUBSTRING(strs FROM 'cn=([^,]+),ou=org1,ou=suborg1,o=myorg') FROM tb1;
Here, you match cn=, then capture into Group 1 any one or more chars other than , with the negated bracket expression [^,]+ and then match ,ou=org1,ou=suborg1,o=myorg to make sure there is your required right-hand context.
Else, you may try a REGEXP_REPLACE approach, but it will leave the values where no match is found intact:
SELECT REGEXP_REPLACE(strs, '.*cn=([^,]+),ou=org1,ou=suborg1,o=myorg.*', '\1') from tb1;
It matches any 0+ chars with .*, then cn=, again captures the non-comma chars into Group 1 and then matches ,ou=org1,ou=suborg1,o=myorg and 0+ chars to the end of the string.
See an online PostgreSQL demo:
CREATE TABLE tb1
    (strs character varying)
;
INSERT INTO tb1
    (strs)
VALUES
    ('cn=user1,ou=org1,ou=suborg1,o=myorg'),
    ('cn=user2,ou=org2,ou=suborg1,o=myorg'),
    ('cn=user3,ou=org1,ou=suborg1,o=myorg'),
    ('cn=user4,ou=org1,ou=suborg2,o=myorg')
;
SELECT REGEXP_REPLACE(strs, '.*cn=([^,]+),ou=org1,ou=suborg1,o=myorg.*', '\1') from tb1;
SELECT substring(strs from 'cn=([^,]+),ou=org1,ou=suborg1,o=myorg') from tb1;
Results:

Note you may leverage a very useful word boundary \y construct (see Table 9.20. Regular Expression Constraint Escapes) if you do not want to match ocn= with cn=,
'.*\ycn=([^,]+),ou=org1,ou=suborg1,o=myorg\y.*'
   ^^                                     ^^
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With