Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Replace REGEXP_SUBSTR in SQL Server

I'm Looking for a way of replacing the use of INSTR(...) and REPLACE(REGEXP_SUBSTR(...)) oracle functions in SQL Server.

Original Oracle:

   SELECT 
      Name,
      CASE
           WHEN SUBSTR (NAME, 1, 2) = 'CG'
           THEN SUBSTR (NAME,INSTR (NAME,'_',1,2)+ 1,LENGTH (NAME))
           ELSE REPLACE (REGEXP_SUBSTR (NAME,'_[^_]+',1,2),'_','')
        END AS OPT,
        CASE
           WHEN SUBSTR (NAME, 1, 2) = 'CG'
           THEN SUBSTR (NAME,INSTR (NAME, '_',1,1) + 1, LENGTH (NAME))
           ELSE REPLACE (REGEXP_SUBSTR (NAME,'_[^_]+',1,1),'_','')
        END as Name2

Updated Example Output:

+---------------------------------+----------------------+------------------------------+
|              NAME               |         OPT          |            Name2             |
+---------------------------------+----------------------+------------------------------+
| AE 344592001H 6186694           | NULL                 | NULL                         |
| AE_161038002_6044777            | 6044777              | 161038002                    |
| BC_VIVS_HNB011A_1WAM            | HNB011A              | VIVS                         |
| BC_56230A_30SP                  | 30SP                 | 56230A                       |
| CG_3334902_NETWK_ ACTLM_3334912 | NETWK_ ACTLM_3334912 | 3334902_NETWK_ ACTLM_3334912 |
| CG_3334574_HMO1_CORACT_3334575  | HMO1_CORACT_3334575  | 3334574_HMO1_CORACT_3334575  |
| CG_3207160_POSC_1502AH_3207161  | POSC_1502AH_3207161  | 3207160_POSC_1502AH_3207161  |
| UH_141015_RHM                   | RHM                  | 141015                       |
| UH_127757_RIV                   | RIV                  | 127757                       |
| UH 523725 RIV                   | NULL                 | NULL                         |
| BS_W0055785_C500_M0005672       | C500                 | W0055785                     |
+---------------------------------+----------------------+------------------------------+

I tried looking at charindex and patindex but nothing panned out, I think the original regex expression is over my head to begin with. Any ideas on how I could mimic this logic in sql server?

like image 869
russds Avatar asked Mar 27 '26 12:03

russds


1 Answers

SQL Fiddle

MS SQL Server 2014 Schema Setup:

CREATE TABLE table_name ( name VARCHAR(50) );
INSERT INTO table_name
SELECT 'AE 344592001H 6186694' UNION ALL
SELECT 'AE_161038002_6044777' UNION ALL
SELECT 'BC_VIVS_HNB011A_1WAM' UNION ALL
SELECT 'BC_56230A_30SP' UNION ALL
SELECT 'CG_3334902_NETWK_ ACTLM_3334912' UNION ALL
SELECT 'CG_3334574_HMO1_CORACT_3334575' UNION ALL
SELECT 'CG_3207160_POSC_1502AH_3207161' UNION ALL
SELECT 'UH_141015_RHM' UNION ALL
SELECT 'UH_127757_RIV' UNION ALL
SELECT 'UH 523725 RIV' UNION ALL
SELECT 'BS_W0055785_C500_M0005672';

Query 1:

WITH Names ( lvl, name, remaining, idx ) AS (
  SELECT 1,
         name,
         name,
         CHARINDEX( '_', name )
  FROM   table_name
  UNION ALL
  SELECT lvl+1,
         name,
         SUBSTRING(remaining,idx+1,LEN(remaining)-idx),
         CASE WHEN CHARINDEX( '_', remaining, idx+1 ) = 0
              THEN 0
              ELSE CHARINDEX( '_', remaining, idx+1 ) - idx
              END
  FROM   Names
  WHERE  idx > 0
)
SELECT Name,
       MAX( CASE WHEN lvl = 3 AND ( Name LIKE 'CG%' OR idx = 0 ) THEN remaining
                 WHEN lvl = 3 THEN SUBSTRING( remaining, 1, idx - 1 )
                 END ) AS OPT,
       MAX( CASE WHEN lvl = 2 AND ( Name LIKE 'CG%' OR idx = 0 ) THEN remaining
                 WHEN lvl = 2 THEN SUBSTRING( remaining, 1, idx - 1 )
                 END ) AS Name2
FROM   Names
GROUP BY Name

Results:

|                            Name |                  OPT |                        Name2 |
|---------------------------------|----------------------|------------------------------|
|           AE 344592001H 6186694 |               (null) |                       (null) |
|            AE_161038002_6044777 |              6044777 |                    161038002 |
|                  BC_56230A_30SP |                 30SP |                       56230A |
|            BC_VIVS_HNB011A_1WAM |              HNB011A |                         VIVS |
|       BS_W0055785_C500_M0005672 |                 C500 |                     W0055785 |
|  CG_3207160_POSC_1502AH_3207161 |  POSC_1502AH_3207161 |  3207160_POSC_1502AH_3207161 |
|  CG_3334574_HMO1_CORACT_3334575 |  HMO1_CORACT_3334575 |  3334574_HMO1_CORACT_3334575 |
| CG_3334902_NETWK_ ACTLM_3334912 | NETWK_ ACTLM_3334912 | 3334902_NETWK_ ACTLM_3334912 |
|                   UH 523725 RIV |               (null) |                       (null) |
|                   UH_127757_RIV |                  RIV |                       127757 |
|                   UH_141015_RHM |                  RHM |                       141015 |

Query 2:

CHARINDEX( expressionToFind, expressionToSerach[, startIndex] ) can be used to find the instances of _ in the word.

  • CHARINDEX( '_', name ) will find the index of the first instance of an _.
  • CHARINDEX( '_', name, CHARINDEX( '_', name ) + 1 ) will find the index of the second instance of an _ or will return 0 if there are not two _ characters.
  • CHARINDEX( '_', name, CHARINDEX( '_', name, CHARINDEX( '_', name ) + 1 ) + 1) will find the index of the third instance of an _ or will return 0 if there are not three _ characters.

Nesting this into an inner select you can use it to get the appropriate SUBSTRINGs in an outer select like this:

SELECT name,
       CASE WHEN idx2 > idx1 AND ( Name LIKE 'CG%' OR idx3 = 0 )THEN SUBSTRING( name, idx2 + 1, LEN( name ) )
            WHEN idx3 > idx2 THEN SUBSTRING( name, idx2 + 1, idx3 - idx2 - 1 )
            END AS OPT,
       CASE WHEN name LIKE 'CG%' THEN SUBSTRING( name, idx1 + 1, LEN( name ) )
            WHEN idx2 > idx1 THEN SUBSTRING( name, idx1 + 1, idx2 - idx1 - 1 )
            END AS Name2
FROM   (
  SELECT name,
         CHARINDEX( '_', name ) AS idx1,
         CHARINDEX( '_', name, CHARINDEX( '_', name ) + 1 ) AS idx2,
         CHARINDEX( '_', name, CHARINDEX( '_', name, CHARINDEX( '_', name ) + 1 ) + 1 ) AS idx3
  FROM   table_name
) t

Results:

|                            name |                  OPT |                        Name2 |
|---------------------------------|----------------------|------------------------------|
|           AE 344592001H 6186694 |               (null) |                       (null) |
|            AE_161038002_6044777 |              6044777 |                    161038002 |
|            BC_VIVS_HNB011A_1WAM |              HNB011A |                         VIVS |
|                  BC_56230A_30SP |                 30SP |                       56230A |
| CG_3334902_NETWK_ ACTLM_3334912 | NETWK_ ACTLM_3334912 | 3334902_NETWK_ ACTLM_3334912 |
|  CG_3334574_HMO1_CORACT_3334575 |  HMO1_CORACT_3334575 |  3334574_HMO1_CORACT_3334575 |
|  CG_3207160_POSC_1502AH_3207161 |  POSC_1502AH_3207161 |  3207160_POSC_1502AH_3207161 |
|                   UH_141015_RHM |                  RHM |                       141015 |
|                   UH_127757_RIV |                  RIV |                       127757 |
|                   UH 523725 RIV |               (null) |                       (null) |
|       BS_W0055785_C500_M0005672 |                 C500 |                     W0055785 |
like image 91
MT0 Avatar answered Mar 30 '26 03:03

MT0



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!