SQL Server replace string based on another table

I have a string, and I would like to replace some words in it by referring to a lookup table.

create table LookupTab 
(
     oldvalue varchar(100),
     newvalue varchar(100)
);

insert into LookupTab 
values ('Run', 'Run Go'), ('Hide', 'Hide Mask'), ('Go', 'Go Run'), ('Mask', 'Mask Hide')

Expected output

string ='i have to go'     
result ='i have to Go Run'   <-- it should not again replace the word Run

string ='i have to go and go again'     
result ='i have to Go Run and Go Run again'

What I have tried

CREATE FUNCTION [dbo].[TranslateString]
    (@Str nvarchar(max))
RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @Result nvarchar(max) = @Str;

    SELECT @Result = REPLACE(@Result, Oldvalue, NewValue) 
    FROM LookupTab; 

    RETURN @Result;
END

But it replaces the already-replaced word again.
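
A minimal sketch of why this happens, assuming a case-insensitive collation (so 'go' matches 'Go') and that the 'Go' row happens to be applied before the 'Run' row; the order in which rows are applied in this kind of variable assignment is not guaranteed:

```sql
-- Each REPLACE sees the output of the previous one, so text inserted
-- by an earlier lookup row can be matched again by a later row.
DECLARE @s nvarchar(max) = N'i have to go';

SET @s = REPLACE(@s, 'Go', 'Go Run');   -- 'go' matches under a case-insensitive collation
SET @s = REPLACE(@s, 'Run', 'Run Go');  -- matches the 'Run' that the first step just inserted

SELECT @s AS chained_result;            -- under these assumptions: 'i have to Go Run Go'
```

Any fix therefore has to make sure each lookup row is matched only against the original text, not against text produced by another row.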

Surensiveaya asked Nov 21 '25 22:11


2 Answers

Demo on dbfiddle

2 steps:

  1. Find all the words to replace.
  2. Replace each @OldValue with its corresponding @NewValue.

CREATE FUNCTION [dbo].[TranslateString]
(
 @Str nvarchar(max)
)RETURNS nvarchar(max)
AS
BEGIN
  DECLARE @OldValue nvarchar(100);
  DECLARE @NewValue nvarchar(100);
  DECLARE @CHARINDEX INT = 0;
  DECLARE @Result nvarchar(max) = @Str; -- nvarchar(max) so inputs longer than 100 characters are not truncated
  DECLARE @TempTable AS TABLE(OldValue varchar(100), NewValue varchar(100), isApply BIT) 

  --1. Region: Find all the words to replace
   WHILE (@CHARINDEX < LEN(@Str))
   BEGIN
     SELECT TOP 1 @OldValue = OldValue, @NewValue = newvalue, @CHARINDEX = CHARINDEX(oldvalue, @Str) 
      FROM LookupTab
        WHERE CHARINDEX(oldvalue, @Str) > @CHARINDEX
        ORDER BY CHARINDEX(oldvalue, @Str)

     IF(ISNULL(@OldValue, '') != '' AND NOT EXISTS(SELECT TOP 1 1 FROM @TempTable WHERE OldValue = @OldValue))   
         INSERT INTO @TempTable(OldValue, NewValue)
         VALUES(@OldValue, @NewValue)

     SET @CHARINDEX = @CHARINDEX + LEN(@OldValue);
   END
 --1. End-Region: Find all the words to replace

  --2. Region: Replace each @OldValue with its @NewValue
  WHILE(EXISTS(SELECT OldValue FROM @TempTable WHERE ISNULL(isApply, 0) = 0))
  BEGIN
       SELECT @OldValue = OldValue, @NewValue = NewValue FROM @TempTable WHERE ISNULL(isApply, 0) = 0

       SET @Result = replace(@Result,@Oldvalue,@NewValue);
       UPDATE @TempTable SET isApply = 1 WHERE OldValue = @OldValue
  END
  --2. End-Region: Replace each @OldValue with its @NewValue

  RETURN @Result;
END


Updated 2020-01-20

A new solution that handles some edge cases. Demo on db<>fiddle.

  • Create a strSplit function to be able to split each word into a table
  • Replace each word by ISNULL(l.newvalue, s.val)
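  • Join all the words back together into @Result after replacing, then return it.

CREATE FUNCTION [dbo].[TranslateString]
(
 @Str nvarchar(max)
)RETURNS nvarchar(max)
AS
BEGIN
    DECLARE @Result NVARCHAR(MAX)
    ;WITH cte_TempTable AS(
       -- Keep the word as-is when it has no match in LookupTab
       select ISNULL(l.newvalue, s.val) AS Value 
       from strSplit(@Str, ' ') s
       left join LookupTab l on s.val = l.oldvalue
    )
    -- TYPE + .value() avoids XML-escaping of characters such as & and <;
    -- RTRIM drops the trailing space left by the concatenation.
    -- Note: there is no ORDER BY here, so word order depends on how strSplit returns its rows.
    SELECT @Result = RTRIM((SELECT Value + ' ' FROM cte_TempTable FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'))

  RETURN @Result;
END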
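
The function above depends on a strSplit helper that is not shown in the answer. A minimal sketch of what such a helper might look like, assuming it takes the string plus a one-character delimiter and returns one row per piece in a val column. The pos column and the loop-based implementation are assumptions for illustration, not the answer's actual code:

```sql
-- Hypothetical helper: splits @Str on @Delim and returns the pieces in order.
CREATE FUNCTION dbo.strSplit
(
    @Str   nvarchar(max),
    @Delim nchar(1)
)
RETURNS @Parts TABLE
(
    pos int IDENTITY(1, 1),   -- assumption: records the original position of each piece
    val nvarchar(max)
)
AS
BEGIN
    DECLARE @Start bigint = 1;
    DECLARE @End   bigint = CHARINDEX(@Delim, @Str);

    -- Emit one piece per delimiter found
    WHILE @End > 0
    BEGIN
        INSERT INTO @Parts (val)
        VALUES (SUBSTRING(@Str, @Start, @End - @Start));

        SET @Start = @End + 1;
        SET @End = CHARINDEX(@Delim, @Str, @Start);
    END

    -- Emit whatever follows the last delimiter; DATALENGTH is used only as an
    -- upper bound on the remaining length (LEN would ignore trailing spaces)
    INSERT INTO @Parts (val)
    VALUES (SUBSTRING(@Str, @Start, DATALENGTH(@Str)));

    RETURN;
END
```

With a helper like this in place, the function can be called as SELECT dbo.TranslateString(N'i have to go and go again'). Because each word is looked up exactly once against the original text, a replacement can never be replaced again. If word order matters, exposing pos through the CTE and adding ORDER BY pos to the FOR XML PATH subquery would make the order explicit.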


Nguyễn Văn Phong answered Nov 23 '25 14:11


This is a challenging problem. Recursive CTEs can be used to replace the strings. However, you don't want to replace already replaced strings. Ouch.

To solve this, you can use two rounds of replacement. The first puts in a placeholder for the old values. The second puts in the new values.

This looks like:

with cte as (
      select convert(varchar(max), v.str) as str, 1 as lev, str as orig_str
      from (values ('i have to go'), ('i have to Go Run')) v(str)
      union all
      select replace(cte.str, lt.oldvalue, concat('[', lt.ord, ']')), 1 + cte.lev, cte.orig_str
      from cte join
           lookuptab lt
           on lt.ord = cte.lev
     ),
     cte2 as (
      select cte.str, 1 as lev, cte.orig_str
      from (select cte.*, row_number() over (partition by cte.orig_str order by lev desc) as seqnum
            from cte
           ) cte
      where seqnum = 1
      union all
      select replace(cte2.str, concat('[', lt.ord, ']'), lt.newvalue), 1 + cte2.lev, cte2.orig_str
      from cte2 join
           lookuptab lt
           on lt.ord = cte2.lev
     )
select top (1) with ties str, orig_str
from cte2
order by row_number() over (partition by orig_str order by lev desc);

Here is a db<>fiddle.

You can add the ord column using row_number() in a CTE, but I would recommend adding it to the table.
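
A sketch of both options for the ord column, assuming the LookupTab definition from the question. The IDENTITY column and the ORDER BY oldvalue ordering are illustrative choices, not something the answer specifies; use one option or the other:

```sql
-- Option A (the recommendation above): store ord in the table.
-- An IDENTITY column numbers the existing rows 1, 2, 3, ... in an unspecified order.
ALTER TABLE LookupTab ADD ord int IDENTITY(1, 1);

-- Option B: derive ord at query time instead of storing it.
-- In the full query, this CTE would go in front of cte in the same WITH list
-- and be joined in place of the base table.
WITH lt AS (
    SELECT oldvalue, newvalue,
           ROW_NUMBER() OVER (ORDER BY oldvalue) AS ord
    FROM LookupTab
)
SELECT ord, oldvalue, newvalue
FROM lt
ORDER BY ord;
```

Either way, the recursive joins on lt.ord = cte.lev only work if ord runs 1, 2, 3, ... without gaps, since the recursion stops at the first level with no matching row. ROW_NUMBER() always produces a dense sequence; a stored IDENTITY column can develop gaps after deletes.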

This logic is also easily incorporated into a UDF, if you want that. I don't recommend a UDF for this purpose.

Gordon Linoff answered Nov 23 '25 13:11


