Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I accurately handle a batch separator for SQL from C#

Tags:

c#

sql

tsql

parsing

For Data Explorer I would like to add support for a Batch separator.

So for example if users type in:

 
select 'GO' go select 1 as go 
Go 
select 100

I would like to return the three result sets.

Its clear that I need some sort of parser here, my hope is that this is a solved problem and I can just plug it in. (writing a full T-SQL parser is not something I would like to do)

What component / demo code could achieve splitting this batch into its 3 parts?

like image 358
Sam Saffron Avatar asked May 31 '10 01:05

Sam Saffron


2 Answers

It's not often that I say this, but this is a case where I'd definitely advocate bending the user input to conform to computer rules rather than trying to solve the problem of having a computer understand massively varied user input

Impose a simple rule of: The word "go" must appear on its own line in order to be interpreted as a command to proceed

If your users can't adhere to a rule like that, should they really be engaged in the far more complex task of writing SQL queries?

like image 159
matt Avatar answered Oct 20 '22 15:10

matt


I was looking for solution of the same problem, but didn't found any suitable (use of SMO was not acceptable in my case). So, I had to write own parser. Here it is:

static IEnumerable<string> ParseSqlBatch(Stream s)
{
    if (s == null)
        throw new ArgumentNullException();

    StringBuilder sbSqlStatement = new StringBuilder();
    Stack<string> state = new Stack<string>();
    StreamReader sr = new StreamReader(s);

    //initially search for "GO" or open tag of strings ('), comments (--, /*) or identifiers ([)
    string pattern = @"(?>(?<=^\s*)go(?=\s*(--.*)?$)|''(?!')|(?<!')'|(?<!\[)\[|--(?=.*)?|/\*)";
    //if open tag found search for close tag, then continue search
    string patternCloseString = @"(?>''|'(?!'))";
    string patternCloseIdentifier = @"(?>\]\]|\](?!\]))";
    string patternComments = @"(?>\*/|/\*)";

    Regex rx = new Regex(pattern, RegexOptions.IgnoreCase);

    while (!sr.EndOfStream)
    {
        string line = sr.ReadLine();

        int ix = 0;
        bool bBreak = false;
        while (ix < line.Length && !bBreak)
        {
            Match m = rx.Match(line, ix);

            if (!m.Success)
            {
                sbSqlStatement.Append(line.Substring(ix));
                break;
            }

            int ix2 = m.Index;
            string word = m.Value;

            sbSqlStatement.Append(line.Substring(ix, ix2 - ix));

            if (state.Count == 0)
            {
                if (string.Compare(word, "GO", true) == 0)
                {
                    if (sbSqlStatement.Length > 0)
                    {
                        yield return sbSqlStatement.ToString();
                        sbSqlStatement = new StringBuilder();
                        break;
                    }
                }
                else
                {
                    switch (word)
                    {
                        case "'":
                            rx = new Regex(patternCloseString);
                            break;
                        case "[":
                            rx = new Regex(patternCloseIdentifier);
                            break;
                        case "/*":
                            rx = new Regex(patternComments);
                            break;
                        case "--":
                            sbSqlStatement.Append(line.Substring(ix2));
                            bBreak = true;
                            continue;
                    }

                    if (word != "''")
                        state.Push(word);
                }
            }
            else
            {
                string st = state.Peek();

                switch (st)
                {
                    case "'":
                        if (st == word)
                            state.Pop();
                        break;
                    case "[":
                        if (word == "]")
                            state.Pop();
                        break;
                    case "/*":
                        if (word == "*/")
                            state.Pop();
                        else if (word == "/*")
                            state.Push(word);
                        break;
                }

                if (state.Count == 0)
                    rx = new Regex(pattern, RegexOptions.IgnoreCase);
            }

            ix = ix2 + word.Length;
            sbSqlStatement.Append(word);
        }

        sbSqlStatement.AppendLine();
    }

    if (sbSqlStatement.Length > 0)
        yield return sbSqlStatement.ToString();
}

It correctly handles "GO" within strings, identifiers and comments. Perhaps not ideal one, but tested successfully over hundreds of various .sql scripts.

And then, for example:

using (FileStream fs = new FileStream("SampleBatch.sql", FileMode.Open, FileAccess.Read))
{
    foreach (string statement in ParseSqlBatch(fs))
    {
        //execute statement here, or do something with it
    }

    fs.Close();
}

I hope it will help someone.

like image 38
i-one Avatar answered Oct 20 '22 17:10

i-one