I am trying to use the tokenizer to scan a file and find all the classes it defines, anything they extend, any instances that are created, and any place a class is invoked statically.
<?php
$text = file_get_contents($file);
$tokens = token_get_all($text);
$used_classes = array();
$defined_classes = array();
$variable_classes = array();
foreach ($tokens as $i => $token) {
    if (is_array($token)) {
        if (isset($tokens[$i - 2][0], $tokens[$i - 1][0])) {
            // new [class]
            if ($tokens[$i - 2][0] == T_NEW && $tokens[$i - 1][0] == T_WHITESPACE) {
                if ($tokens[$i][0] == T_STRING) {
                    $used_classes[$token[1]] = true;
                // new $variable()
                } elseif ($tokens[$i][0] == T_VARIABLE) {
                    // @todo, this is really broken. However, do best to look for the assignment
                    // of a quoted string to this variable somewhere in the file.
                    $pattern = '~' . preg_quote($token[1], '~') . '\s*=\s*([\'"])((?:(?!\1).)*)\1~';
                    if (preg_match($pattern, $text, $match)) {
                        // $extension_classes is assumed to be populated elsewhere
                        if (empty($extension_classes[$match[2]])) {
                            $used_classes[$match[2]] = true;
                        }
                    } elseif ($token[1] !== '$this') {
                        $variable_classes[$token[1]] = true;
                    }
                }
            }
            // class [class]
            if ($tokens[$i - 2][0] == T_CLASS && $tokens[$i - 1][0] == T_WHITESPACE) {
                if ($tokens[$i][0] == T_STRING) {
                    $defined_classes[$token[1]] = true;
                }
            }
            // @todo: find more classes \/
            // class [classname] extends [class] ???
            // [class]::method()???
        }
    }
}
How can I extend this code to find the other uses of PHP classes mentioned above?
Parsing and then interpreting PHP code is not something that can be solved well using a regex. You would need something much more clever, like a state machine, that can actually understand things like scope, class names, inheritance, etc. to be able to do what you want.
As it happens, I have written a PHP-to-JavaScript converter based on a state machine that does most of what you want:
all the defined classes
Yes, every class creates a ClassScope with all its variables listed, and its methods are created as FunctionScope objects, so you can tell which methods a class has.
anything they extend
Yes, every class has its parent classes listed in ClassScope->$parentClasses
any created instances
Nope, but it wouldn't be hard to add extra code to record these.
anytime they were statically invoked.
Nope - but that actually could be done with a regex.
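As a rough illustration of the regex idea for static calls, here is a minimal sketch. The function name find_static_calls is hypothetical and not part of the converter project, and because it works on raw text it will also match inside strings and comments.

```php
<?php
// Hypothetical helper (not from the converter project): collect the class
// names that appear in static calls such as Foo::bar().
function find_static_calls(string $source): array
{
    $classes = array();
    // A class name (optionally namespaced), then "::", a method name and "(".
    // The lookbehind stops matches that start inside a variable or identifier,
    // so $obj::bar() is not reported as class "obj".
    $pattern = '~(?<![$\w\\\\])([A-Za-z_\\\\][A-Za-z0-9_\\\\]*)\s*::\s*[A-Za-z_]\w*\s*\(~';
    if (preg_match_all($pattern, $source, $matches)) {
        foreach ($matches[1] as $name) {
            // self, static and parent refer to the enclosing class, not a named one.
            if (!in_array(strtolower($name), array('self', 'static', 'parent'), true)) {
                $classes[$name] = true;
            }
        }
    }
    return $classes;
}
```

Calling find_static_calls(file_get_contents($file)) returns an array keyed by class name, in the same shape as $used_classes in the question.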
Although it doesn't exactly solve your problem, the project as it stands would get you 95% of the way towards what you want to do, which would save a couple of weeks' work.
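If you would rather stay with the tokenizer, a very small state machine that remembers the last significant token can cover the two @todo cases from the question. The sketch below is only an assumption about how that might look: scan_classes and its result keys are hypothetical names, it records any Class:: reference (constants and ::class included, not only method calls), and on PHP versions before 8 it sees only the last segment of a namespaced name.

```php
<?php
// Hypothetical helper: classify class names by the significant token
// that comes directly before or after them.
function scan_classes(string $source): array
{
    $found = array(
        'defined'  => array(), // class [name]
        'extended' => array(), // extends [name]
        'created'  => array(), // new [name]
        'static'   => array(), // [name]::something
    );

    // Tokens that can carry a class name. PHP 8 adds dedicated tokens
    // for namespaced names, so only use them when they exist.
    $name_tokens = array(T_STRING);
    foreach (array('T_NAME_QUALIFIED', 'T_NAME_FULLY_QUALIFIED') as $constant) {
        if (defined($constant)) {
            $name_tokens[] = constant($constant);
        }
    }
    $skip = array(T_WHITESPACE, T_COMMENT, T_DOC_COMMENT);
    $relative = array('self', 'parent');

    $previous = null;  // id of the last significant token
    $last_name = null; // text of the previous token when it was a name

    foreach (token_get_all($source) as $token) {
        $id = is_array($token) ? $token[0] : $token;
        if (in_array($id, $skip, true)) {
            continue; // whitespace and comments never change the state
        }
        if (in_array($id, $name_tokens, true)) {
            $is_relative = in_array(strtolower($token[1]), $relative, true);
            if ($previous === T_CLASS) {
                $found['defined'][$token[1]] = true;
            } elseif ($previous === T_EXTENDS) {
                $found['extended'][$token[1]] = true;
            } elseif ($previous === T_NEW && !$is_relative) {
                $found['created'][$token[1]] = true;
            }
            $last_name = $is_relative ? null : $token[1];
        } else {
            if ($id === T_DOUBLE_COLON && $last_name !== null) {
                $found['static'][$last_name] = true;
            }
            $last_name = null;
        }
        $previous = $id;
    }
    return $found;
}
```

Unlike the $tokens[$i - 2] lookups in the question, skipping whitespace and comments means the check still works when there is a newline or a comment between the keyword and the class name.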