
Regex pattern inside SQL Replace function?

SELECT REPLACE('<strong>100</strong><b>.00 GB', '%^(^-?\d*\.{0,1}\d+$)%', ''); 

I want to strip the markup that sits between the two parts of the number using the regex above, but it does not seem to work. I'm not sure whether the regex syntax is wrong, because I also tried a simpler pattern, '%[^0-9]%', just to test, and that didn't work either. Does anyone know how I can achieve this?

JanT asked Jan 27 '14 10:01


2 Answers

You can use PATINDEX to find the position of the first character that matches the pattern, then use STUFF to replace the character at that position with another string (an empty string to delete it).
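To show the PATINDEX/STUFF idea on its own before applying it to a table, here is a minimal sketch that cleans a single value. The variable names `@s` and `@pos` are made up for illustration; the input string is the one from the question.

```sql
-- Sketch: strip every character that is not a digit or a dot from one value.
DECLARE @s varchar(100) = '<strong>100</strong><b>.00 GB';

-- 1-based position of the first unwanted character, 0 if there is none
DECLARE @pos int = PATINDEX('%[^0-9.]%', @s);

WHILE @pos > 0
BEGIN
    SET @s = STUFF(@s, @pos, 1, '');        -- delete the one character at @pos
    SET @pos = PATINDEX('%[^0-9.]%', @s);   -- look for the next unwanted character
END;

SELECT @s AS cleaned;  -- 100.00
```

Each pass removes exactly one character, which is why the table versions below need a loop as well.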

Loop through each row and replace each illegal character with whatever you want; in your case, replace every non-numeric character with an empty string. The inner loop handles cells that contain more than one illegal character.

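    DECLARE @counter int
    SET @counter = 0

    -- Visit every ID from 0 up to and including the highest one
    WHILE (@counter <= (SELECT MAX(ID_COLUMN) FROM [Table]))
    BEGIN
        -- Inner loop: remove one illegal character per pass
        WHILE 1 = 1
        BEGIN
            DECLARE @RetVal varchar(50)

            -- STUFF returns NULL when PATINDEX finds nothing (position 0),
            -- and the subquery returns NULL when no row has this ID
            SET @RetVal = (SELECT STUFF([Column], PATINDEX('%[^0-9.]%', [Column]), 1, '')
                           FROM [Table]
                           WHERE ID_COLUMN = @counter)

            IF (@RetVal IS NOT NULL)
                UPDATE [Table]
                SET [Column] = @RetVal
                WHERE ID_COLUMN = @counter
            ELSE
                BREAK
        END

        SET @counter = @counter + 1
    END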

Caution: this is slow! A varchar column may affect performance, so trimming the values with LTRIM and RTRIM first may help a bit. Either way, it is slow.

Credit goes to this Stack Overflow answer.

EDIT Credit also goes to @srutzky

Edit (by @Tmdean) Instead of doing one row at a time, this answer can be adapted to a more set-based solution. It still iterates as many times as the maximum number of non-numeric characters in any single row, so it's not ideal, but I think it should be acceptable in most situations.

    WHILE 1 = 1
    BEGIN
        WITH q AS
            (SELECT ID_Column, PATINDEX('%[^0-9.]%', [Column]) AS n
             FROM [Table])
        UPDATE [Table]
        SET [Column] = STUFF([Column], q.n, 1, '')
        FROM q
        WHERE [Table].ID_Column = q.ID_Column AND q.n != 0;

        IF @@ROWCOUNT = 0 BREAK;
    END;

You can also improve efficiency quite a lot if you maintain a bit column in the table that indicates whether the field has been scrubbed yet. (NULL represents "Unknown" in my example and should be the column default.)

    DECLARE @done bit = 0;
    WHILE @done = 0
    BEGIN
        WITH q AS
            (SELECT ID_Column, PATINDEX('%[^0-9.]%', [Column]) AS n
             FROM [Table]
             WHERE COALESCE(Scrubbed_Column, 0) = 0)
        UPDATE [Table]
        SET [Column] = STUFF([Column], q.n, 1, ''),
            Scrubbed_Column = 0
        FROM q
        WHERE [Table].ID_Column = q.ID_Column AND q.n != 0;

        IF @@ROWCOUNT = 0 SET @done = 1;

        -- if Scrubbed_Column is still NULL, then the PATINDEX
        -- must have given 0
        UPDATE [Table]
        SET Scrubbed_Column = CASE
            WHEN Scrubbed_Column IS NULL THEN 1
            ELSE NULLIF(Scrubbed_Column, 0)
        END;
    END;

If you don't want to change your schema, this is easy to adapt: store the intermediate results in a table variable and apply them to the actual table at the end.
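One way the table-variable adaptation might look is sketched below. This is only an illustration, not the answerer's own code: `#Demo`, `@work` and `Val` are hypothetical names, and the temp table just stands in for the real table so the script runs on its own.

```sql
-- Hypothetical stand-in for the real table so the sketch is self-contained.
CREATE TABLE #Demo (ID_Column int PRIMARY KEY, Val varchar(50));
INSERT INTO #Demo (ID_Column, Val)
VALUES (1, '<strong>100</strong><b>.00 GB'), (2, '42.5'), (3, '7 GB');

-- Work on a copy held in a table variable; only rows that need cleaning are copied.
DECLARE @work TABLE (ID_Column int PRIMARY KEY, Val varchar(50));
INSERT INTO @work (ID_Column, Val)
SELECT ID_Column, Val
FROM #Demo
WHERE Val LIKE '%[^0-9.]%';

-- Same set-based loop as above, but it only touches the copy.
WHILE 1 = 1
BEGIN
    UPDATE @work
    SET Val = STUFF(Val, PATINDEX('%[^0-9.]%', Val), 1, '')
    WHERE Val LIKE '%[^0-9.]%';

    IF @@ROWCOUNT = 0 BREAK;
END;

-- Apply the cleaned values to the real table in a single statement.
UPDATE d
SET d.Val = w.Val
FROM #Demo AS d
JOIN @work AS w ON w.ID_Column = d.ID_Column;

SELECT ID_Column, Val FROM #Demo ORDER BY ID_Column;  -- 100.00, 42.5, 7

DROP TABLE #Demo;
```

Because rows that are already clean never enter `@work`, no flag column is needed, and the real table is written only once.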

Mukus answered Sep 21 '22 10:09


Instead of stripping out only the one character at the found position, using Replace(Column, BadFoundCharacter, '') could be substantially faster: rather than removing just the next bad character in each value, it removes every occurrence of that character in the value at once.

    WHILE 1 = 1
    BEGIN
        UPDATE dbo.YourTable
        SET [Column] = REPLACE([Column], SUBSTRING([Column], PATINDEX('%[^0-9.-]%', [Column]), 1), '')
        WHERE [Column] LIKE '%[^0-9.-]%';

        IF @@ROWCOUNT = 0 BREAK;
    END;
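To see the effect of removing all occurrences per pass, the same idea can be tried on a single value. This is a rough sketch with made-up variable names (`@s`, `@passes`); the number of passes it reports depends on the collation, since REPLACE treats upper and lower case as the same character under a case-insensitive collation.

```sql
DECLARE @s varchar(100) = '<strong>100</strong><b>.00 GB';
DECLARE @passes int = 0;

WHILE @s LIKE '%[^0-9.-]%'
BEGIN
    -- Take the first unwanted character and remove every occurrence of it.
    SET @s = REPLACE(@s, SUBSTRING(@s, PATINDEX('%[^0-9.-]%', @s), 1), '');
    SET @passes = @passes + 1;
END;

SELECT @s AS cleaned, @passes AS passes;  -- cleaned = 100.00
```

The input has 23 unwanted characters but far fewer distinct ones, so `passes` comes out well below the 23 single-character deletions the STUFF approach would need.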

I am convinced this will work better than the accepted answer, if only because it does fewer operations. There are other ways that might also be faster, but I don't have time to explore those right now.

ErikE answered Sep 18 '22 10:09