I am trying to extract a substring from a text column using a regular expression, but in some cases, there are multiple instances of that substring in the string.
In those cases, I am finding that the query does not return the first occurrence of the substring. Does anyone know what I am doing wrong?
For example:
If I have this data:
create table data1
(full_text text, name text);
insert into data1 (full_text)
values ('I 56, donkey, moon, I 92')
I am using
UPDATE data1
SET name = substring(full_text from '%#"I ([0-9]{1,3})#"%' for '#')
and I want to get 'I 56'
not 'I 92'
Regex is a sequence of characters that defines a pattern that can filter data in PostgreSQL. The TILDE (~) operator and the wildcard operator “. *” is used to implement PostgreSQL's regular expressions.
PostgreSQL provides a built-in function named SUBSTRING() that extracts a substring from any specific string. The SUBSTRING() function accepts three parameters: a string, starting position, and length. The starting position” and “length” parameters are optional that can be skipped depending on the situation.
To execute an SQL statement with a single result row, EXECUTE can be used. To save the result, add an INTO clause. EXEC SQL BEGIN DECLARE SECTION; const char *stmt = "SELECT a, b, c FROM test1 WHERE a > ?"; int v1, v2; VARCHAR v3[50]; EXEC SQL END DECLARE SECTION; EXEC SQL PREPARE mystmt FROM :stmt; ...
The simplest and very common pattern matching character operators is the . This simply allows for any single character to match where a . is placed in a regular expression. For example /b.t/ can match to bat, bit, but or anything like bbt, bct ....
You can use regexp_matches()
instead:
update data1
set full_text = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1];
As no additional flag is passed, regexp_matches()
only returns the first match - but it returns an array so you need to pick the first (and only) element from the result (that's the [1]
part)
It is probably a good idea to limit the update to only rows that would match the regex in the first place:
update data1
set full_text = (regexp_matches(full_text, 'I [0-9]{1,3}'))[1]
where full_text ~ 'I [0-9]{1,3}'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With