Parsing/Tokenizing a String Containing a SQL Command

Are there any open source libraries (any language, Python/PHP preferred) that will tokenize/parse an ANSI SQL string into its various components?

That is, if I had the following string

 SELECT a.foo, b.baz, a.bar
 FROM TABLE_A a
 LEFT JOIN TABLE_B b
 ON a.id = b.id
 WHERE baz = 'snafu';

I'd get back a data structure/object something like

 // illustrative PHP: the shape of the result I'm after
 $results['select-columns'] = ['a.foo', 'b.baz', 'a.bar'];
 $results['tables']         = ['TABLE_A', 'TABLE_B'];
 $results['table-aliases']  = ['a' => 'TABLE_A', 'b' => 'TABLE_B'];
 // etc...

Restated, I'm looking for the code in a database package that teases the SQL command apart so that the engine knows what to do with it. Searching the internet turns up a lot of results on how to parse a string WITH SQL. That's not what I want.

I realize I could glop through an open source database's code to find what I want, but I was hoping for something a little more ready-made (although if you know where in the MySQL, PostgreSQL, or SQLite source to look, feel free to pass it along).

Thanks!

Alan Storm asked Mar 13 '10 19:03



1 Answer

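The SQLite source tree includes a file named parse.y that contains the grammar for the SQL dialect SQLite accepts. You can pass that file to the Lemon parser generator (which ships in the same source tree) to generate C code that parses SQL according to that grammar.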
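If a generated C parser is more than you need, the tokenize-then-walk idea can be sketched by hand. The Python below is only a rough illustration under strong assumptions: it is not Lemon output and not SQLite's tokenizer, it handles only the simple SELECT ... FROM ... JOIN shape from the question, and the names `TOKEN_RE`, `KEYWORDS`, `tokenize`, and `summarize` are made up for this example.

```python
import re

# Token patterns for a tiny SQL subset (an assumption, not the SQLite grammar).
TOKEN_RE = re.compile(r"""
    (?P<string>'(?:[^']|'')*')            # single-quoted literal
  | (?P<number>\d+(?:\.\d+)?)             # integer or decimal
  | (?P<name>[A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)?)  # foo or a.foo
  | (?P<op><=|>=|<>|!=|[=<>,;()*])        # operators and punctuation
  | (?P<ws>\s+)                           # whitespace, skipped below
""", re.VERBOSE)

# Hypothetical keyword list, just large enough for the example query.
KEYWORDS = {"SELECT", "FROM", "LEFT", "RIGHT", "INNER", "OUTER",
            "JOIN", "ON", "WHERE", "AND", "OR", "AS"}


def tokenize(sql):
    """Split sql into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(sql):
        m = TOKEN_RE.match(sql, pos)
        if m is None:
            raise ValueError(f"unexpected character {sql[pos]!r} at offset {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "ws":
            continue
        if kind == "name" and text.upper() in KEYWORDS:
            kind, text = "keyword", text.upper()
        tokens.append((kind, text))
    return tokens


def summarize(tokens):
    """Collect select columns, tables, and table aliases from the token list."""
    result = {"select-columns": [], "tables": [], "table-aliases": {}}
    i = 0
    # Columns: every name between SELECT and FROM.
    if tokens and tokens[0] == ("keyword", "SELECT"):
        i = 1
        while i < len(tokens) and tokens[i] != ("keyword", "FROM"):
            if tokens[i][0] == "name":
                result["select-columns"].append(tokens[i][1])
            i += 1
    # Tables: the name right after FROM or JOIN, plus an optional alias.
    while i < len(tokens):
        if tokens[i] in (("keyword", "FROM"), ("keyword", "JOIN")):
            if i + 1 < len(tokens) and tokens[i + 1][0] == "name":
                table = tokens[i + 1][1]
                result["tables"].append(table)
                j = i + 2
                if j < len(tokens) and tokens[j] == ("keyword", "AS"):
                    j += 1
                if j < len(tokens) and tokens[j][0] == "name":
                    result["table-aliases"][tokens[j][1]] = table
        i += 1
    return result


if __name__ == "__main__":
    query = """SELECT a.foo, b.baz, a.bar
    FROM TABLE_A a
    LEFT JOIN TABLE_B b
    ON a.id = b.id
    WHERE baz = 'snafu';"""
    print(summarize(tokenize(query)))
```

For the query in the question this gives the three select columns, the two tables, and the a/b alias mapping. Anything beyond that shape (subqueries, quoted identifiers, comments, expressions in the select list) needs a real grammar, which is where parse.y and Lemon come in.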

ardsrk answered Sep 22 '22 12:09
