Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Separate multiple SQL statements using Regex in Java

Tags:

java

regex

Given a string that may contain one or more SQL statements separated by ";", for example:

String sql = "select * from table1 where col1 = 'abc;de'; select * from table2;";

I need to obtain in an array of strings the statements:

array[0] = "select * from table1 where col1 = 'abc;de';"
array[1] = "select * from table2;"

Note that semicolons could appear between apostrophes.

For example, using regex:

String regex = "???";  // <--- I can't figure out this one
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(sql);

What would be the regex to accomplish this?

like image 516
ps0604 Avatar asked Mar 03 '15 22:03

ps0604


2 Answers

You can try the following regular expression:

\s*;\s*(?=([^']*'[^']*')*[^']*$)

Visual Regex

Here is the example:

public static void main(String[] args) {
    String input = "select * from table1 where col1 = 'abc;de'; select * from table2;";

    System.out.println(java.util.Arrays.toString(
        input.split("\\s*;\\s*(?=([^']*'[^']*')*[^']*$)")
    )); // prints "[select * from table1 where col1 = 'abc;de', select * from table2]"

}

Explanation of the regular expression:

NODE                     EXPLANATION
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  ;                        ';'
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (?=                      look ahead to see if there is:
--------------------------------------------------------------------------------
    (                        group and capture to \1 (0 or more times
                             (matching the most amount possible)):
--------------------------------------------------------------------------------
      [^']*                    any character except: ''' (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      '                        '\''
--------------------------------------------------------------------------------
      [^']*                    any character except: ''' (0 or more
                               times (matching the most amount
                               possible))
--------------------------------------------------------------------------------
      '                        '\''
--------------------------------------------------------------------------------
    )*                       end of \1 (NOTE: because you are using a
                             quantifier on this capture, only the
                             LAST repetition of the captured pattern
                             will be stored in \1)
--------------------------------------------------------------------------------
    [^']*                    any character except: ''' (0 or more
                             times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    $                        before an optional \n, and the end of
                             the string
--------------------------------------------------------------------------------
  )                        end of look-ahead
like image 168
Paul Vargas Avatar answered Sep 22 '22 19:09

Paul Vargas


I come in a while after the battle, but if this can help someone...

The most simple way I found out is to rely on IBatis migration tool that has a batch file processing tool which internally splits the file into its statements.

Internally, in the ScriptRunner class, they use :

private static final Pattern DELIMITER_PATTERN = Pattern.compile("^\\s*((--)|(//))?\\s*(//)?\\s*@DELIMITER\\s+([^\\s]+)", 2);
like image 31
Benj Avatar answered Sep 19 '22 19:09

Benj