
How to remove php code from a string?

I have a string that contains PHP code, and I need to remove that code from the string. For example:

<?php $db1 = new ps_DB() ?><p>Dummy</p>

Should return <p>Dummy</p>

A string with no PHP in it, for example <p>Dummy</p>, should be returned unchanged.

I know this can be done with a regular expression, but after four hours I haven't found a solution.

Gonzalo asked Jul 15 '10 18:07




2 Answers

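<?php
// Keep only the tokens the PHP tokenizer classifies as inline HTML;
// everything that belongs to a PHP block is replaced by an empty string.
function filter_html_tokens($a){
    return is_array($a) && $a[0] == T_INLINE_HTML ?
      $a[1]:
      '';
}
$htmlphpstring = '<a>foo</a> something <?php $db1 = new ps_DB() ?><p>Dummy</p>';
// token_get_all() is part of the tokenizer extension
echo implode('',array_map('filter_html_tokens',token_get_all($htmlphpstring)));
?>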

As ircmaxell pointed out: this would require valid PHP!

A regex route would be the following. It allows short tags (no 'php' after the '<?'), allows for a missing closing ?> at the end of the string or file (which Zend recommends, for some reason), and of course uses an ungreedy, DOTALL pattern:

preg_replace('/<\\?.*(\\?>|$)/Us', '',$htmlphpstring);
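
As a minimal sketch, the pattern above can be wrapped in a small helper and tried against the strings from the question. The function name strip_php_regex is hypothetical, not something from the original answer, and the null fallback is an added assumption for the case where preg_replace() reports an error.

```php
<?php
// Hypothetical helper name; the pattern itself is the one from the answer.
function strip_php_regex(string $html): string
{
    // U = ungreedy, s = dot also matches newlines. The match stops at the
    // first closing tag or, if the block is never closed, at the end of
    // the string. preg_replace() returns null on error, so fall back to
    // the unmodified input in that case (an assumption of this sketch).
    return preg_replace('/<\?.*(\?>|$)/Us', '', $html) ?? $html;
}

echo strip_php_regex('<?php $db1 = new ps_DB() ?><p>Dummy</p>'), PHP_EOL; // <p>Dummy</p>
echo strip_php_regex('<p>Dummy</p>'), PHP_EOL;                            // <p>Dummy</p>
echo strip_php_regex('<p>Dummy</p><?php echo $x'), PHP_EOL;               // <p>Dummy</p>
```

Keep in mind that the pattern knows nothing about PHP syntax: a closing tag inside a PHP string literal ends the match early, and an XML declaration such as <?xml ... ?> is removed as well.
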
Wrikken answered Sep 26 '22 03:09


Well, you can use DomDocument to do it...

function stripPHPFromHTML($html) {
    $dom = new DomDocument();
    $dom->loadHtml($html);
    removeProcessingInstructions($dom);
    $simple = simplexml_import_dom($dom->getElementsByTagName('body')->item(0));
    return $simple->children()->asXml();
}

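function removeProcessingInstructions(DOMNode $node) {
    if (!$node->hasChildNodes()) {
        return;
    }
    // Iterate over a snapshot: childNodes is a live list, so removing
    // nodes while looping over it directly can skip siblings.
    foreach (iterator_to_array($node->childNodes) as $child) {
        if ($child instanceof DOMProcessingInstruction) {
            $node->removeChild($child);
        } else {
            removeProcessingInstructions($child);
        }
    }
}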

Those two functions will turn $str into $clean, which should match $html:

$str = '<?php echo "foo"; ?><b>Bar</b>';
$clean = stripPHPFromHTML($str);
$html = '<b>Bar</b>';

Edit: Actually, after looking at Wrikken's answer, I realized that both methods have a disadvantage. Mine requires somewhat valid HTML markup (DOM is decent, but it won't parse <b>foo</b><?php echo $bar). Wrikken's requires valid PHP (any syntax errors and it'll fail). So perhaps combine the two: try one first, and if it fails, try the other. If both fail, there's really not much you can do without figuring out exactly why they failed.
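
One way the "try one, then fall back" idea could look is sketched below. It is not the exact combination described above: it tries the tokenizer first and, instead of the DOM route, falls back to the regex from Wrikken's answer. The function name strip_php is hypothetical, and the sketch assumes PHP 7 or later, where token_get_all() accepts TOKEN_PARSE and throws a ParseError on invalid PHP.

```php
<?php
// Hypothetical helper; tokenizer first, regex as the fallback.
function strip_php(string $source): string
{
    try {
        // TOKEN_PARSE makes the tokenizer throw a ParseError when the
        // embedded PHP is not syntactically valid (PHP 7+).
        $tokens = token_get_all($source, TOKEN_PARSE);
    } catch (ParseError $e) {
        // Fallback: cut everything from an opening tag up to the first
        // closing tag, or to the end of the string if it is never closed.
        return preg_replace('/<\?.*(\?>|$)/Us', '', $source) ?? $source;
    }

    // Valid PHP: keep only the parts the tokenizer marks as inline HTML.
    $html = '';
    foreach ($tokens as $token) {
        if (is_array($token) && $token[0] === T_INLINE_HTML) {
            $html .= $token[1];
        }
    }
    return $html;
}

echo strip_php('<?php $db1 = new ps_DB() ?><p>Dummy</p>'), PHP_EOL; // <p>Dummy</p>
echo strip_php('<b>foo</b><?php echo $bar'), PHP_EOL;               // <b>foo</b>
```

The second call contains an unterminated echo statement, so the tokenizer pass is expected to throw and the regex takes over.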

ircmaxell answered Sep 26 '22 03:09