
Escape function for regular expression or LIKE patterns

Tags: regex, pattern-matching, escaping, postgresql, plpgsql

If you'd rather skip the full problem, my basic question is:
Is there a function in PostgreSQL to escape regular expression characters in a string?

I've searched the documentation but was unable to find such a function.

Here is the full problem:

In a PostgreSQL database, I have a column with unique names in it. I also have a process which periodically inserts names into this field, and, to prevent duplicates, if it needs to enter a name that already exists, it appends a space and parentheses with a count to the end.

e.g. Name, Name (1), Name (2), Name (3), etc.

As it stands, I use the following code to find the next number to add in the series (written in plpgsql):

var_name_id := 1;

SELECT CAST(substring(a.name from E'\\((\\d+)\\)$') AS int)
INTO var_last_name_id
FROM my_table.names a
WHERE a.name LIKE var_name || ' (%)'
ORDER BY CAST(substring(a.name from E'\\((\\d+)\\)$') AS int) DESC
LIMIT 1;

IF var_last_name_id IS NOT NULL THEN
    var_name_id = var_last_name_id + 1;
END IF;

var_new_name := var_name || ' (' || var_name_id || ')';

(var_name contains the name I'm trying to insert.)

This works for now, but the problem lies in the WHERE clause:

WHERE a.name LIKE var_name || ' (%)'

This check doesn't verify that the % in question is a number, and it doesn't account for multiple parentheses, as in something like "Name ((1))". If either case existed, a cast exception would be thrown.

The WHERE clause really needs to be something more like:

WHERE a.r1_name ~* (var_name || E' \\(\\d+\\)')

But var_name could contain regular expression characters, which leads to the question above: Is there a function in PostgreSQL that escapes regular expression characters in a string, so I could do something like:

WHERE a.r1_name ~* (regex_escape(var_name) || E' \\(\\d+\\)')

Any suggestions are much appreciated, including a possible reworking of my duplicate name solution.

762

asked Feb 28 '11 15:02

Benny

1 Answer

To address the question at the top:

This assumes standard_conforming_strings = on, which has been the default since Postgres 9.1.

Regular expression escape function

Let's start with a complete list of characters with special meaning in regular expression patterns:

!$()*+.:<=>?[\]^{|}-

Wrapped in a bracket expression, most of them lose their special meaning, with a few exceptions:

- The hyphen (-) needs to be first or last, or it signifies a range of characters.
- ] and \ have to be escaped with \ (in the replacement, too).
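
These bracket-expression rules can be checked directly in psql. A minimal sketch, assuming standard_conforming_strings = on; the sample strings are made up for illustration:

```sql
-- Inside a bracket expression, . and * are plain characters.
SELECT regexp_replace('a.b*c', '[.*]', '#', 'g') AS dot_and_star;    -- a#b#c

-- A hyphen in last position is literal; between two characters it forms a range.
SELECT regexp_replace('a-c b', '[ac-]', '#', 'g') AS hyphen_last     -- ### b
     , regexp_replace('a-c b', '[a-c]', '#', 'g') AS hyphen_range;   -- #-# #

-- ] and \ are written as \] and \\ inside the brackets.
SELECT regexp_replace('a]b\c', '[\]\\]', '#', 'g') AS bracket_and_backslash;  -- a#b#c
```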

After adding capturing parentheses for the back reference below we get this regexp pattern:

([!$()*+.:<=>?[\\\]^{|}-])

Using it, this function escapes all special characters with a backslash (\), thereby removing the special meaning:

CREATE OR REPLACE FUNCTION f_regexp_escape(text)
  RETURNS text
  LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE AS
$func$
SELECT regexp_replace($1, '([!$()*+.:<=>?[\\\]^{|}-])', '\\\1', 'g')
$func$;

The function is marked PARALLEL SAFE (because it is) to allow parallelism for queries using it in Postgres 10 or later.

Demo

SELECT f_regexp_escape('test(1) > Foo*');

Returns:

test\(1\) \> Foo\*

And while:

SELECT 'test(1) > Foo*' ~ 'test(1) > Foo*';

returns FALSE (the unescaped parentheses form a group and * acts as a quantifier), which may come as a surprise to naive users,

SELECT 'test(1) > Foo*' ~ f_regexp_escape('test(1) > Foo*');

returns TRUE, as it should.
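
Tied back to the question, the escaped name can be embedded in an anchored pattern to find the next free counter. This is only a sketch under assumptions: it requires f_regexp_escape() as defined above, and the names CTE with its sample rows is a hypothetical stand-in for my_table.names.

```sql
-- Hypothetical sample data standing in for my_table.names
WITH names(name) AS (
   VALUES ('Foo*'), ('Foo* (1)'), ('Foo* (2)')
        , ('Foo* ((3))')   -- doubled parentheses: must not match
        , ('Foo* (x)')     -- not a number: must not match
        , ('Fooo (7)')     -- would match if * were left unescaped
   )
SELECT COALESCE(max(substring(name FROM '\((\d+)\)$')::int), 0) + 1 AS next_id
FROM   names
WHERE  name ~ ('^' || f_regexp_escape('Foo*') || ' \(\d+\)$');
-- next_id = 3
```

The parentheses around the concatenation matter, because ~ and || share the same operator precedence and would otherwise be evaluated left to right. Swap in ~* if names should be compared case-insensitively, as in the question.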

LIKE escape function

For completeness, the counterpart for LIKE patterns, where only three characters are special:

\%_

The manual:

The default escape character is the backslash but a different one can be selected by using the ESCAPE clause.

This function assumes the default:

CREATE OR REPLACE FUNCTION f_like_escape(text)
  RETURNS text
  LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE AS
$func$
SELECT replace(replace(replace($1
         , '\', '\\')  -- must come 1st
         , '%', '\%')
         , '_', '\_');
$func$;

We could use the more elegant regexp_replace() here, too, but for the few characters, a cascade of replace() functions is faster.

Again, PARALLEL SAFE in Postgres 10 or later.
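
For comparison, the regexp_replace() variant mentioned above might look like the following sketch. The name f_like_escape_re is made up for illustration, and no claim is made here about how much slower it is in practice.

```sql
-- Hypothetical alternative to f_like_escape(), using a single regexp_replace() call
CREATE OR REPLACE FUNCTION f_like_escape_re(text)
  RETURNS text
  LANGUAGE sql IMMUTABLE STRICT AS
$func$
SELECT regexp_replace($1, '([\\%_])', '\\\1', 'g')  -- prefix \, % and _ with a backslash
$func$;

SELECT f_like_escape_re('20% \ 50% low_prices');
-- 20\% \\ 50\% low\_prices
```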

Demo

SELECT f_like_escape('20% \ 50% low_prices');

Returns:

20\% \\ 50\% low\_prices
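
To see the effect inside an actual LIKE comparison, here is a small sketch, assuming f_like_escape() from above has been created; the two sample rows are invented for illustration:

```sql
SELECT name
     , name LIKE '%' || '50%_' || '%'                AS raw_match      -- % and _ act as wildcards
     , name LIKE '%' || f_like_escape('50%_') || '%' AS escaped_match  -- matched literally
FROM  (VALUES ('save 50%_ today'), ('save 500 today')) t(name);
-- save 50%_ today | t | t
-- save 500 today  | t | f
```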

164

answered Sep 24 '22 16:09

Erwin Brandstetter
