
Find whitespace-injecting PHP files

Tags: regex, grep, bash, php

Whitespace outside the PHP tags is sometimes problematic, so I'm trying to recursively find all files that meet one or both of these conditions:

1) Does not begin with a < or # character.

and/or

2) Does not end in a > character, unless it does end in a close brace which is followed by any amount of newlines.

I think that the first condition would be: $[^<#]

I think that the second condition would be: [ [^>^] | [}\n*^]]

However, note that in my naive regexes $ and ^ represent the start and end of the file, not of any line in the file. And even with those, assuming that they were correct, how would I combine them? Like so?

[$[^<#]] | [[ [^>^] | [}\n*^]]]

Then, putting them in grep:

grep -r [$[^<#]] | [[ [^>^] | [}\n*^]]] *

Obviously, this is Not Working (tm). Can someone teach me how to correct the mistakes? Thanks.

This is a good file:

<?php

?>

So is this:

<?php
function someFunc(){
}


‏

And this is good too:

#!/usr/bin/php -q
<?php
?>

Leading HTML is fine:

<html>
<?php
echo '</html>';
?>

Trailing HTML is fine too:

<?php
echo '<html>';
?>
</html>

This is bad (leading newline):

‏
<?php

?>

This is bad too (leading space):

 <?php

?>

This is bad as well (trailing newline):

<?php

?>
‏
dotancohen asked Sep 13 '12 22:09




1 Answer

I quickly put together an expression that I think does what you want. It's pretty late here, so I hope I understood your request correctly.

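Try this regular expression: /\A(?:\s+.*>|[^<#].*>\s*|<.*>\s+)\Z/s. The \A and \Z anchors refer to the start and end of the whole subject rather than of a line, and the s modifier lets the dot match newlines, so the expression has to be tested against the entire file content at once. The three alternatives cover a file that starts with whitespace and ends in >, a file whose first character is neither < nor # and that ends in > plus optional whitespace, and a file that starts with < but has whitespace after the final >. Explained here: http://regex101.com/r/cT7eY5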
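To run the expression over a directory tree, something like the following Python sketch might work. It is Python rather than grep, because grep matches line by line by default while this pattern needs to see a whole file. The pattern is copied unchanged; the function names is_suspect and find_suspect_files and the "*.php" glob are my own assumptions for illustration. Note that Python's \Z anchors only at the very end of the string, which is slightly stricter than \Z in PCRE.

```python
import re
from pathlib import Path

# The expression from the answer, unchanged; re.DOTALL plays the role of /s.
SUSPECT = re.compile(r"\A(?:\s+.*>|[^<#].*>\s*|<.*>\s+)\Z", re.DOTALL)


def is_suspect(text):
    """Return True when the whole file content matches the expression."""
    return SUSPECT.match(text) is not None


def find_suspect_files(root, glob="*.php"):
    """Recursively collect files under root whose content matches."""
    hits = []
    for path in sorted(Path(root).rglob(glob)):
        if path.is_file():
            # Undecodable bytes are replaced so one odd file does not abort the scan.
            text = path.read_text(encoding="utf-8", errors="replace")
            if is_suspect(text):
                hits.append(path)
    return hits
```

Calling find_suspect_files(".") from the project root would return the matching paths as a list, which can then be printed one per line.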

I hope this helps. If I misunderstood you in any way, please clarify and I will try to adjust the expression.
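If the expression turns out to miss a case, the two conditions from the question can also be checked literally, without a regular expression. This is only a minimal Python sketch: the function name violates_conditions is hypothetical, and it assumes that an empty file should not be flagged, which the question does not specify.

```python
def violates_conditions(text):
    """Check the question's two conditions against a whole file's content."""
    if not text:
        # Assumption: an empty file injects nothing, so it is not flagged.
        return False
    # Condition 1: the file does not begin with '<' or '#'.
    bad_start = text[0] not in "<#"
    # Condition 2: the file does not end in '>', unless it ends in a
    # close brace followed by any number of newlines.
    ends_ok = text.endswith(">") or text.rstrip("\n").endswith("}")
    return bad_start or not ends_ok
```

Read literally, condition 2 also flags a file with a single newline after the closing ?>, so that rule may need loosening if such files are acceptable to you.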

Firas Dib answered Oct 04 '22 05:10