Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Regex in Postgres to extract full DN in OpenLDAP

I have a program to pass a full string of groups a user in OpenLDAP to Postgres query. The string is exactly like this:

( 'cn=user1,ou=org1,ou=suborg1,o=myorg','cn=user2,ou=org2,ou=suborg1,o=myorg','cn=user3,ou=org1,ou=suborg1,o=myorg','cn=user4,ou=org1,ou=suborg2,o=myorg' )

In a query, I only want that to be this in Postgres:

'user1','user3'

Basically extract value of cn= when the rest of the string is ou=org1,ou=suborg1,o=myorg.

user2 has ou=org2,ou=suborg1,o=myorg which is org2 so it won't match. user4 won't match on suborg2 ,... The variation is unlimited so I like to look for exact match ou=org1,ou=suborg1,o=myorg only.

I know how to do replace but it can't handle unlimited scenarios. Is there a clean way to do this in regexp_replace or regexp_extract?

like image 205
HP. Avatar asked Jul 04 '18 18:07

HP.


1 Answers

Probably the cleanest is by using SUBSTRING that can return just the captured substring:

SELECT SUBSTRING(strs FROM 'cn=([^,]+),ou=org1,ou=suborg1,o=myorg') FROM tb1;

Here, you match cn=, then capture into Group 1 any one or more chars other than , with the negated bracket expression [^,]+ and then match ,ou=org1,ou=suborg1,o=myorg to make sure there is your required right-hand context.

Else, you may try a REGEXP_REPLACE approach, but it will leave the values where no match is found intact:

SELECT REGEXP_REPLACE(strs, '.*cn=([^,]+),ou=org1,ou=suborg1,o=myorg.*', '\1') from tb1;

It matches any 0+ chars with .*, then cn=, again captures the non-comma chars into Group 1 and then matches ,ou=org1,ou=suborg1,o=myorg and 0+ chars to the end of the string.

See an online PostgreSQL demo:

CREATE TABLE tb1
    (strs character varying)
;

INSERT INTO tb1
    (strs)
VALUES
    ('cn=user1,ou=org1,ou=suborg1,o=myorg'),
    ('cn=user2,ou=org2,ou=suborg1,o=myorg'),
    ('cn=user3,ou=org1,ou=suborg1,o=myorg'),
    ('cn=user4,ou=org1,ou=suborg2,o=myorg')
;

SELECT REGEXP_REPLACE(strs, '.*cn=([^,]+),ou=org1,ou=suborg1,o=myorg.*', '\1') from tb1;
SELECT substring(strs from 'cn=([^,]+),ou=org1,ou=suborg1,o=myorg') from tb1;

Results:

enter image description here

Note you may leverage a very useful word boundary \y construct (see Table 9.20. Regular Expression Constraint Escapes) if you do not want to match ocn= with cn=,

'.*\ycn=([^,]+),ou=org1,ou=suborg1,o=myorg\y.*'
   ^^                                     ^^
like image 184
Wiktor Stribiżew Avatar answered Oct 12 '22 13:10

Wiktor Stribiżew