I need to extract the column names from a MySQL SELECT statement, and I'd like to do it with a regex.
It's a plain SELECT, something like: SELECT column1, column2 ... FROM table
I have to cover every case: with or without an alias, with or without a table name in front, with or without the quoting character:
SELECT column, column as foo, table.column, table.column as foo,
`column`, `column` as foo, `table`.`column`, `table`.`column` as foo
.....
So far I've come up with this regex: #\w+(\sas)?#i
but it doesn't handle columns prefixed with a table name.
Any help?
By the way, is regex a good tool for this task?
EDIT
Thanks for the answers!
The patterns you posted work on the whole query, but I'm actually already processing each column individually:
$fields = Frameworkmethod::getSelectFields($query);
$columns = explode(',' , $fields);
foreach($columns as $column)
{
//do Regex work to "clean up" the single field and get the "standard" one (not the alias)
//`#__tracktime_projects`.`pr_name` AS `project_name` should return pr_name
}
As stated in the comment above, I always need the field name, not the alias. Sorry for not pointing that out before!
I made use of "Collapse and Capture a Repeating Pattern in a Single Regex Expression" and adapted it to fit this purpose.
So, a hopefully bulletproof RegEx for capturing column names from a *SQL query:
/(?:SELECT\s++(?=(?:[#\w,`.]++\s++)+)|(?!^)\G\s*+,\s*+(?:`?+\s*+[#\w]++\s*+`?+\s*+\.\s*+)?+`?+\s*+)(\w++)`?+(?:\s++as\s++[^,\s]++)?+/ig
Online demo with explanation: http://regex101.com/r/wL7yA9
PHP code using preg_match_all() with a single RegEx, commented with the /x modifier:
preg_match_all('/(?:SELECT\s++(?=(?:[\#\w,`.]++\s++)+) # start matching on SELECT
| # or
(?!^)\G # resume from last match position
\s*+,\s*+ # delimited by a comma
(?:`?+\s*+ # optional prefix table with optional backtick
[\#\w]++ # table name
\s*+`?+ # optional backtick
\s*+\.\s*+ # dot separator
)?+ # optional prefix table end group
`?+\s*+ # optional backtick
) # initial match or subsequent match
(\w++) # capturing group
`?+ # optional backtick
(?:\s++as\s++[^,\s]++)?+ # optional alias
/ix', $query, $matches);
Live code: http://codepad.viper-7.com/VTaPd3
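As a usage sketch for the single-regex approach: the column names end up in capture group 1, i.e. $matches[1]. The sample query below is made up for illustration. Note that, reading the pattern, the SELECT branch goes straight to (\w++), so a first column that is backticked or table-prefixed may not be captured as expected; the sample therefore keeps the first column plain.

```php
<?php
// Same pattern as above, written on one line (no /x, so # needs no escaping).
$regex = '/(?:SELECT\s++(?=(?:[#\w,`.]++\s++)+)|(?!^)\G\s*+,\s*+(?:`?+\s*+[#\w]++\s*+`?+\s*+\.\s*+)?+`?+\s*+)(\w++)`?+(?:\s++as\s++[^,\s]++)?+/i';

// Hypothetical sample query, not taken from the question.
$query = 'SELECT column1, `t`.`column2` AS foo, t.column3 FROM table';

preg_match_all($regex, $query, $matches);

// Group 1 holds one entry per matched column.
$columns = $matches[1];
print_r($columns);
```

Each successful match contributes one entry, and the \G anchor makes every match after the first continue exactly where the previous one stopped, which is why matching ends once the column list does.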
Note: the 'hopefully bulletproof' is aimed at valid SQL
PHP code using explode():
$columns = explode(',', $fields);
foreach ($columns as $column)
{
    $regex = '/([\w]++)`?+(?:\s++as\s++[^,\s]++)?+\s*+(?:FROM\s*+|$)/i';
    preg_match($regex, $column, $match);
    print $match[1]; // field stored in $match[1]
}
Live code with example extraction: http://codepad.viper-7.com/OdUGXd
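The per-column variant can also be wrapped in a small helper, which makes it easier to handle a field that does not match. This is only a sketch: the function name extractColumnName and the sample $fields string are assumptions for illustration, and the pattern is the one from the loop above.

```php
<?php
// Hypothetical helper: returns the bare column name for one item of the
// SELECT list, or null when the pattern does not match.
function extractColumnName(string $column): ?string
{
    $regex = '/(\w++)`?+(?:\s++as\s++[^,\s]++)?+\s*+(?:FROM\s*+|$)/i';
    return preg_match($regex, $column, $match) === 1 ? $match[1] : null;
}

// Sample field list (assumed), including the example from the question.
$fields = '`#__tracktime_projects`.`pr_name` AS `project_name`, table.column3, column2 as foo';

$names = array_map('extractColumnName', explode(',', $fields));
print_r($names);
```

Because the pattern is anchored at the end of the field (or at FROM), it picks the last identifier before an optional alias, so the table prefix is skipped without being matched explicitly.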
I used PHP:
$query = 'SELECT column1, column2 as foo, table.column3, table.column4 as foo,
`column5`, `column6` as foo, `table`.`column7`, `table`.`column8` as foo
FROM table';
$query = preg_replace('/^SELECT(.*?)FROM.*$/s', '$1', $query); // To remove the "SELECT" and "FROM table..." parts
preg_match_all('/(?:
(?:`?\w+`?\.)? (?:`)?(\w+)(?:`)? (?:\s*as\s*\w+)?\s*
# ^--TableName-^ ^---ColumnName--^ ^----AsFoo-----^
)+/x',$query, $m);
print_r($m[1]);
Output:
Array
(
[0] => column1
[1] => column2
[2] => column3
[3] => column4
[4] => column5
[5] => column6
[6] => column7
[7] => column8
)
Live demo: http://www.rubular.com/r/H960NFKCTr
UPDATE: Since you're using some "unusual" but valid SQL table names (e.g. #__tracktime_projects), the regex above no longer matched them. To fix this, I added a variable holding the characters we expect in names, and I also added the i modifier to make the match case-insensitive:
$query = 'SELECT column1, column2 as foo, table.column3, table.column4 as foo,
`column5`, `column6` as foo, `table`.`column7`, `table`.`column8` as foo, `#__tracktime_projects`.`pr_name` AS project_name, `#wut`
FROM table';
$query = preg_replace('/^SELECT(.*?)FROM.*$/s', '$1', $query); // To remove the "SELECT" and "FROM table..." parts
$allowed = '\w#'; // Adjust this to the names that you expect.
preg_match_all('/(?:
(?:`?['.$allowed.']++`?\.)?
# ^--------TableName--------^
(?:`)?(['.$allowed.']++)(?:`)?
# ^----------ColumnName--------^
(?:\s*as\s*['.$allowed.']++)?\s*
# ^-------------AsFoo------------^
)+
/xi',$query, $m);
print_r($m[1]);
Output:
Array
(
[0] => column1
[1] => column2
[2] => column3
[3] => column4
[4] => column5
[5] => column6
[6] => column7
[7] => column8
[8] => pr_name
[9] => #wut
)
Live demo: http://www.rubular.com/r/D0iIHJQwB8
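For reuse, the two steps above could be folded into one function. The following is a sketch under assumptions: the name extractSelectColumns and its signature are made up, and it differs from the pattern above in one respect, namely that the alias may also be wrapped in backticks. Without that, an alias such as AS `project_name` (the form used in the question) would be picked up as if it were a column.

```php
<?php
// Hypothetical wrapper around the approach above.
function extractSelectColumns(string $query, string $allowed = '\w#'): array
{
    // Keep only the text between SELECT and the first FROM.
    $fields = preg_replace('/^SELECT(.*?)FROM.*$/s', '$1', $query);

    preg_match_all('/(?:
        (?:`?[' . $allowed . ']++`?\.)?            # optional table prefix
        `?([' . $allowed . ']++)`?                 # column name (captured)
        (?:\s*as\s*`?[' . $allowed . ']++`?)?\s*   # optional alias, backticks allowed
    )+/xi', $fields, $m);

    return $m[1];
}

$query = 'SELECT column1, table.column2 AS foo,
          `#__tracktime_projects`.`pr_name` AS `project_name`
          FROM table';
print_r(extractSelectColumns($query));
```

With this sample query the function should return column1, column2 and pr_name. Passing a different $allowed string adjusts which characters count as part of a name, as described in the update.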