I'm trying to iterate over a directory which contains loads of PHP files, and detect what classes are defined in each file.
Consider the following:
$php_files_and_content = new PhpFileAndContentIterator($dir);
foreach($php_files_and_content as $filepath => $sourceCode) {
// echo $filepath, $sourceCode
}
The $php_files_and_content variable above represents an iterator where the key is the file path and the value is the source code of the file (as if that wasn't obvious from the example).
This is then supplied into another iterator which will match all the defined classes in the source code, ala:
class DefinedClassDetector extends FilterIterator implements RecursiveIterator {
    // Accept only files whose source defines at least one class
    public function accept() {
        return $this->hasChildren();
    }
    public function hasChildren() {
        $classes = getDefinedClasses($this->current());
        return !empty($classes);
    }
    // Expose the class names found in the current file as a child iterator
    public function getChildren() {
        return new RecursiveArrayIterator(getDefinedClasses($this->current()));
    }
}
$defined_classes = new RecursiveIteratorIterator(new DefinedClassDetector($php_files_and_content));
foreach($defined_classes as $index => $class) {
// print "$index => $class"; outputs:
// 0 => Class A
// 1 => Class B
// 0 => Class C
}
The reason the $index isn't numerically sequential is that 'Class C' was defined in the second source file, so the array returned for that file starts from index 0 again. This is preserved in the RecursiveIteratorIterator because each set of results represents a separate Iterator (and thus its own key/value pairs).
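The restarting keys can be reproduced without touching the filesystem. This is a minimal sketch, assuming a hand-written array stands in for the per-file results (the file and class names are the ones from the example output above):

```php
<?php
// Stand-in for "file path => classes defined in that file".
$perFile = new RecursiveArrayIterator(array(
    'file1.php' => array('Class A', 'Class B'),
    'file2.php' => array('Class C'),
));

// Default mode is LEAVES_ONLY, so only the class names are visited.
$flat = new RecursiveIteratorIterator($perFile);

$pairs = array();
foreach ($flat as $index => $class) {
    // key() comes from the innermost (per-file) iterator,
    // so it restarts at 0 for every file.
    $pairs[] = "$index => $class";
}

print implode("\n", $pairs) . "\n";
// 0 => Class A
// 1 => Class B
// 0 => Class C
```

Because the keys repeat, copying such an iterator with iterator_to_array() needs false as the second argument, otherwise later entries overwrite earlier ones that share a key.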
Anyway, what I am trying to do now is find the best way to combine these, such that when I iterate over the new iterator, the key is the class name (from the $defined_classes iterator) and the value is the original file path, ala:
foreach($classes_and_paths as $class => $filepath) {
// print "$class => $filepath"; outputs
// Class A => file1.php
// Class B => file1.php
// Class C => file2.php
}
And that's where I'm stuck thus far.
At the moment, the only solution that is coming to mind is to create a new RecursiveIterator that overrides the current() method to return the outer iterator's key() (which would be the original filepath), and the key() method to return the inner iterator's current() value. But I'm not favouring this solution because:
Any ideas or suggestions gratefully received.
I also realise there are far faster, more efficient ways of doing this, but this is also an exercise in using Iterators for myself, and also an exercise in promoting code reuse, so any new Iterators that have to be written should be as minimal as possible and try to leverage existing functionality.
Thanks
OK, I think I finally got my head around this. Here's roughly what I did in pseudo-code:
Step 1 We need to list the directory contents, thus we can perform the following:
// Reads through the $dir directory
// traversing children, and returns all contents
$dirIterator = new RecursiveDirectoryIterator($dir);
// Flattens the recursive iterator into a single
// dimension, so it doesn't need recursive loops
$dirContents = new RecursiveIteratorIterator($dirIterator);
Step 2 We need to consider only the PHP files
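class PhpFileIteratorFilter extends FilterIterator {
    public function accept() {
        $current = $this->current();
        // pathinfo() is used instead of end(explode(...)): end() expects
        // a variable by reference, not the result of a function call
        return $current instanceof SplFileInfo
            && $current->isFile()
            && pathinfo($current->getBasename(), PATHINFO_EXTENSION) === 'php';
    }
}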
// Extends FilterIterator, and accepts only .php files
$php_files = new PhpFileIteratorFilter($dirContents);
The PhpFileIteratorFilter isn't a great example of re-usable code. A better approach would have been to supply a file extension at construction and have the filter match on that. That said, I am trying to move away from constructor arguments where they are not required and rely more on composition, because that makes better use of the Strategy pattern. The PhpFileIteratorFilter could simply have used a generic FileExtensionIteratorFilter and set itself up internally.
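One way the generic filter could look is sketched below. This is only an illustration: FileExtensionIteratorFilter is named in the text but never shown, so its body, the protected $extension property, and the idea of configuring it through a subclass default are all assumptions.

```php
<?php
// Hypothetical generic filter: accepts files whose extension matches $extension.
class FileExtensionIteratorFilter extends FilterIterator {
    // Subclasses set this instead of passing a constructor argument.
    protected $extension = '';

    public function accept(): bool {
        $current = $this->current();
        return $current instanceof SplFileInfo
            && $current->isFile()
            && strtolower($current->getExtension()) === $this->extension;
    }
}

// The PHP-specific filter only configures the generic one.
class PhpFileIteratorFilter extends FileExtensionIteratorFilter {
    protected $extension = 'php';
}
```

With this shape, `new PhpFileIteratorFilter($dirContents)` is used exactly as before, and a filter for another extension is a three-line subclass.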
Step 3 We must now read in the file contents
class SplFileInfoReader extends FilterIterator {
    public function accept() {
        // Use parent::current() to get the SplFileInfo object;
        // this class's own current() returns the file contents instead
        $current = parent::current();
        return $current instanceof SplFileInfo
            && $current->isFile()
            && $current->isReadable();
    }
    public function key() {
        return parent::current()->getRealPath();
    }
    public function current() {
        return file_get_contents($this->key());
    }
}
// Reads the file contents of the .php files
// the key is the file path, the value is the file contents
$files_and_content = new SplFileInfoReader($php_files);
Step 4 Now we want to apply our callback to each item (the file contents) and somehow retain the results. Again, trying to make use of the Strategy pattern, I've done away with unnecessary constructor arguments, e.g. $preserveKeys or similar.
/**
* Applies $callback to each element, and only accepts values that have children
*/
class ArrayCallbackFilterIterator extends FilterIterator implements RecursiveIterator {
    // Declared explicitly; dynamic properties are deprecated as of PHP 8.2
    protected $callback;
    protected $results = array();
    public function __construct(Iterator $it, $callback) {
        if (!is_callable($callback)) {
            throw new InvalidArgumentException('$callback is not callable');
        }
        $this->callback = $callback;
        parent::__construct($it);
    }
    public function accept() {
        return $this->hasChildren();
    }
    public function hasChildren() {
        // Run the callback on the current value and keep the result for getChildren()
        $this->results = call_user_func($this->callback, $this->current());
        return is_array($this->results) && !empty($this->results);
    }
    public function getChildren() {
        return new RecursiveArrayIterator($this->results);
    }
}
/**
* Overrides ArrayCallbackFilterIterator to allow a fixed $key to be returned
*/
class FixedKeyArrayCallbackFilterIterator extends ArrayCallbackFilterIterator {
    public function getChildren() {
        return new RecursiveFixedKeyArrayIterator($this->key(), $this->results);
    }
}
/**
* Extends RecursiveArrayIterator to allow a fixed $key to be set
*/
class RecursiveFixedKeyArrayIterator extends RecursiveArrayIterator {
    // The key reported for every element, regardless of its array index
    protected $key;
    public function __construct($key, $array) {
        $this->key = $key;
        parent::__construct($array);
    }
    public function key() {
        return $this->key;
    }
}
So, here I have my basic iterator, which returns the results of the $callback I supplied, but I've also extended it to create a version that preserves the keys too, rather than using a constructor argument for it.
And thus we have this:
// Returns a RecursiveIterator
// key: file path
// value: class name
$class_filter = new FixedKeyArrayCallbackFilterIterator($files_and_content, 'getDefinedClasses');
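The getDefinedClasses callback is used throughout but never shown. The following is a minimal sketch of what it might look like, assuming it takes PHP source code as a string and returns an array of unqualified class names. It uses token_get_all() from the tokenizer extension; it ignores namespaces, interfaces, traits and enums, and is not the original implementation.

```php
<?php
// Hypothetical callback: lists the classes declared in a string of PHP source.
function getDefinedClasses($sourceCode) {
    $classes = array();
    $previous = null;     // id of the last significant token seen
    $expectName = false;  // true right after a "class" keyword

    foreach (token_get_all($sourceCode) as $token) {
        // Tokens are either [id, text, line] arrays or single characters.
        $id = is_array($token) ? $token[0] : $token;
        if ($id === T_WHITESPACE || $id === T_COMMENT || $id === T_DOC_COMMENT) {
            continue;
        }
        if ($expectName) {
            // Anonymous classes have no T_STRING name after "class".
            if ($id === T_STRING) {
                $classes[] = $token[1];
            }
            $expectName = false;
        } elseif ($id === T_CLASS && $previous !== T_DOUBLE_COLON) {
            // Skip the Foo::class constant, which is not a declaration.
            $expectName = true;
        }
        $previous = $id;
    }
    return $classes;
}
```

Since it returns an empty array when nothing is declared, it fits the `is_array($this->results) && !empty($this->results)` check in hasChildren(), so files without classes are filtered out.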
Step 5 Now we need to put it into a suitable format. I want the class names to be the keys and the file paths to be the values (i.e. a direct mapping from a class to the file in which it can be found, for the autoloader).
// Reduce the multi-dimensional iterator into a single dimension
$files_and_classes = new RecursiveIteratorIterator($class_filter);
// Flip it around, so the class names are keys
$classes_and_files = new FlipIterator($files_and_classes);
And voila, I can now iterate over $classes_and_files and get a list of all defined classes under $dir, along with the file they're defined in. And pretty much all of the code used to do this is re-usable in other contexts as well. I haven't hard-coded anything in the defined Iterators to achieve this task, nor have I done any extra processing outside the iterators.
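FlipIterator is not part of SPL and is not defined above. A minimal sketch of one possible version follows, assuming it only needs to wrap another iterator and swap each key with its value:

```php
<?php
// Hypothetical FlipIterator: yields value => key for the wrapped iterator.
class FlipIterator extends IteratorIterator {
    #[\ReturnTypeWillChange]
    public function key() {
        // The wrapped iterator's value becomes the key...
        return parent::current();
    }

    #[\ReturnTypeWillChange]
    public function current() {
        // ...and its key becomes the value.
        return parent::key();
    }
}

// Usage with stand-in data shaped like $files_and_classes (path => class name).
$files_and_classes = new ArrayIterator(array(
    'file1.php' => 'ClassA',
    'file2.php' => 'ClassC',
));
$map = array();
foreach (new FlipIterator($files_and_classes) as $class => $filepath) {
    $map[$class] = $filepath;
}
```

After flipping, the class names are the keys, so the result can be collected into a plain array as long as each class name is defined only once under $dir.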