
RegEx for SQL Server string to replace with unicode prefix

I have a problem similar to this one: .NET Regex for SQL Server string... but not Unicode string?

The RegEx (?:N'(?:''|[^'])*'[^']*)*(?<!N)'(?<value>(?:''|[^'])*)' doesn't match this string correctly:

Insert into SomeTable (someColumns) values ('someValue', N'someValue', 'someValue')

It reports "N'someValue', 'someValue'" as a match.

I can't figure out how to correct the RegEx so that it matches all string literals except those with the N prefix.

As mentioned in the link above, the RegEx has to ignore escaped quotes inside a string, as in 'some '' escaped'.

asked May 20 '26 16:05 by deterministicFail


1 Answer

In my opinion, there is a better tool for your job - the TSql100Parser class:

using Microsoft.Data.Schema.ScriptDom;
using Microsoft.Data.Schema.ScriptDom.Sql;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class Program
{
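    static void Main(string[] args)
    {
        // Sample statement with plain, N-prefixed, and escaped-quote literals.
        var tsql = @"
                Insert into SomeTable (someColumns) 
                values ('someValue1', 
                        N'someValue2', 
                        'someValue3',
                        'some '' escaped')";
        var result = GetLiterals(tsql);

        // Print the text of each literal token that was found.
        foreach (var literal in result)
            System.Console.WriteLine(literal);
    }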

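    // Returns the text of every string literal token in the query.
    private static List<string> GetLiterals(string strQuery)
    {
        var parser = new TSql100Parser(false);
        // Tokenizer errors are collected here; this sample does not inspect them.
        IList<ParseError> errors = new List<ParseError>();
        // Only the token stream is needed; no full parse is required.
        var result = 
            parser.GetTokenStream(new StringReader(strQuery), errors);
        // Keep both kinds of literal. To find only the literals that lack
        // the N prefix, drop the UnicodeStringLiteral condition.
        return result
            .Where(t =>
                t.TokenType == TSqlTokenType.AsciiStringLiteral ||
                t.TokenType == TSqlTokenType.UnicodeStringLiteral)
            .Select(t => t.Text)
            .ToList();
    }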
}

Regular expressions (Type-3 grammars) cannot parse T-SQL in general, because its grammar is not regular. The same applies when you try to parse HTML. A regex-based solution will not be 100% reliable on real-world input.
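
If referencing the ScriptDom assembly is not an option, the part of the job that matters here (knowing whether a quote opens a plain literal, an N-literal, or sits inside a comment or quoted identifier) can be roughly sketched as a small hand-written scanner. This is only a sketch under stated assumptions, not a T-SQL lexer: the UnicodePrefixer class and AddUnicodePrefix method are hypothetical names, QUOTED_IDENTIFIER is assumed ON (so "..." is an identifier), a lowercase n is assumed to count as the Unicode prefix, and nested block comments are not handled.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of ScriptDom: adds the N prefix to every
// plain '...' literal and copies everything else through unchanged.
static class UnicodePrefixer
{
    public static string AddUnicodePrefix(string sql)
    {
        var sb = new StringBuilder(sql.Length + 16);
        int i = 0;
        while (i < sql.Length)
        {
            char c = sql[i];
            int end; // index just past the token that starts at i
            if (c == '-' && Next(sql, i) == '-')
            {
                // Line comment: runs through the end of the line.
                end = sql.IndexOf('\n', i);
                end = end < 0 ? sql.Length : end + 1;
            }
            else if (c == '/' && Next(sql, i) == '*')
            {
                // Block comment (nesting is not handled in this sketch).
                end = sql.IndexOf("*/", i + 2, StringComparison.Ordinal);
                end = end < 0 ? sql.Length : end + 2;
            }
            else if (c == '[')
            {
                end = EndOfDelimited(sql, i, ']');   // [bracketed identifier]
            }
            else if (c == '"')
            {
                end = EndOfDelimited(sql, i, '"');   // assumes QUOTED_IDENTIFIER ON
            }
            else if (c == '\'')
            {
                end = EndOfDelimited(sql, i, '\'');  // string literal, '' is an escape
                if (!HasUnicodePrefix(sql, i))
                    sb.Append('N');
            }
            else
            {
                end = i + 1;                         // any other character
            }
            sb.Append(sql, i, end - i);
            i = end;
        }
        return sb.ToString();
    }

    // Character after position i, or '\0' at the end of the input.
    static char Next(string s, int i) => i + 1 < s.Length ? s[i + 1] : '\0';

    // Index just past the closing delimiter; a doubled delimiter is an escape.
    static int EndOfDelimited(string s, int start, char close)
    {
        int i = start + 1;
        while (i < s.Length)
        {
            if (s[i] == close)
            {
                if (i + 1 < s.Length && s[i + 1] == close) { i += 2; continue; }
                return i + 1;
            }
            i++;
        }
        return s.Length; // unterminated: treat the rest as one token
    }

    // True when the quote is directly preceded by a standalone N (or n),
    // that is, an N that is not the tail of a longer identifier.
    static bool HasUnicodePrefix(string s, int quote)
    {
        if (quote == 0 || char.ToUpperInvariant(s[quote - 1]) != 'N') return false;
        return quote == 1 || !IsIdentifierChar(s[quote - 2]);
    }

    static bool IsIdentifierChar(char c) =>
        char.IsLetterOrDigit(c) || c == '_' || c == '@' || c == '#' || c == '$';
}
```

Calling UnicodePrefixer.AddUnicodePrefix on the statement from the question leaves N'someValue' untouched and adds the prefix to the other literals, including 'some '' escaped'. A quote inside a comment such as -- it's is copied through without being treated as the start of a literal, which is the kind of context the original lookbehind-based pattern has no way to track.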

answered May 23 '26 12:05 by Alex Filipovici