
Replace Tag in HTML with DOMDocument

Tags: dom, php

I'm trying to edit HTML tags with DOMDocument::loadHTML in PHP. The HTML is a fragment, not a whole page. I followed the approach from this page (PHP - DOMDocument - need to change/replace an existing HTML tag w/ a new one).

This should convert the pre tags into div tags, but it throws "Fatal error: Uncaught exception 'DOMException' with message 'Not Found Error'."

<?php
$contents = <<<STR
<pre>hi</pre>
<pre>hello</pre>
<pre>bye</pre>
STR;

$dom = new DOMDocument;
@$dom->loadHTML($contents);

foreach( $dom->getElementsByTagName("pre") as $nodePre ) {
    $nodeDiv = $dom->createElement("div", $nodePre->nodeValue);
    $dom->replaceChild($nodeDiv, $nodePre);
}

echo $dom->saveHTML();
?>

[Edit] When I try to iterate over the node list backwards, I get this error: 'Notice: Trying to get property of non-object...'

<?php
$contents = <<<STR
<pre>hi</pre>
<pre>hello</pre>
<pre>bye</pre>
STR;

$dom = new DOMDocument;
@$dom->loadHTML($contents);
$domPre = $dom->getElementsByTagName('pre');
$length = $domPre->length;

for ($i = $length; $i > -1; $i--) {
    $nodePre = $domPre->item($i);
    echo $nodePre->nodeValue . '<br />';
//  $nodeDiv = $dom->createElement("div", $nodePre->nodeValue);
//  $dom->replaceChild($nodeDiv, $nodePre);
}

// echo $dom->saveHTML();
?>
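
A note on that notice: DOMNodeList::item() returns null when the index is outside 0 to length - 1, and the loop above starts at $length, so the first iteration reads a property of null. The following is a minimal sketch of that behaviour, using a shortened two-element fragment chosen only for illustration:

```php
<?php
$dom = new DOMDocument;
// Suppress parser warnings, as in the snippets above.
@$dom->loadHTML("<pre>hi</pre><pre>bye</pre>");
$domPre = $dom->getElementsByTagName('pre');

// Valid indexes run from 0 to length - 1.
$last = $domPre->item($domPre->length - 1);
// An index equal to length is already out of range, so item() returns null.
$pastEnd = $domPre->item($domPre->length);

var_dump($last->nodeValue);
var_dump($pastEnd);
```

Starting the loop at $length - 1 avoids the null item.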

[Edit] OK, solved. Since the code in the answer had an error, I'm posting the solution here. Thanks, all.

Solution:

<?php
$contents = <<<STR
<pre>hi</pre>
<pre>hello</pre>
<pre>bye</pre>
STR;

$dom = new DOMDocument;
@$dom->loadHTML($contents);
$domPre = $dom->getElementsByTagName('pre');
$length = $domPre->length;

for ($i = $length - 1; $i >= 0; $i--) {
    $nodePre = $domPre->item($i);
    $nodeDiv = $dom->createElement("div", $nodePre->nodeValue);
    $nodePre->parentNode->replaceChild($nodeDiv, $nodePre);
}

echo $dom->saveHTML();
?>
Teno asked Aug 18 '12 12:08


2 Answers

The problem is the call to replaceChild(). Rather than

$dom->replaceChild($nodeDiv, $nodePre);

use

$nodePre->parentNode->replaceChild($nodeDiv, $nodePre);
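
The reason, as far as I can tell, is that replaceChild() only works on a direct child of the node it is called on, and loadHTML() wraps a fragment in html and body elements, so the pre nodes are children of body rather than of the document. A small sketch to check this, using a one-element fragment chosen only for illustration:

```php
<?php
$dom = new DOMDocument;
// Suppress parser warnings, as in the question.
@$dom->loadHTML("<pre>hi</pre>");

$nodePre = $dom->getElementsByTagName('pre')->item(0);

// The parser adds the implied <html> and <body> wrappers,
// so the direct parent of <pre> is <body>, not the document.
$parentName = $nodePre->parentNode->nodeName;
// The document's own element child is the <html> root.
$rootName = $dom->documentElement->nodeName;

echo $parentName, "\n";
echo $rootName, "\n";
```

Calling replaceChild() on $nodePre->parentNode works regardless of where the element sits in the tree.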

Update

Here is working code. The DOMNodeList returned by getElementsByTagName() is live, so replacing a node while iterating forward shifts the remaining items and some of them get skipped (more info here: http://php.net/manual/en/domnode.replacechild.php). You'll have to use a reverse loop to replace the elements.

$contents = <<<STR
<pre>hi</pre>
<pre>hello</pre>
<pre>bye</pre>
STR;

$dom = new DOMDocument;
@$dom->loadHTML($contents);

$elements = $dom->getElementsByTagName("pre");
for ($i = $elements->length - 1; $i >= 0; $i--) {
    $nodePre = $elements->item($i);
    $nodeDiv = $dom->createElement("div", $nodePre->nodeValue);
    $nodePre->parentNode->replaceChild($nodeDiv, $nodePre);
}
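
One caveat with the loop above: createElement("div", $nodePre->nodeValue) copies only the text content, so any markup nested inside a pre is lost. A possible variation, sketched below, copies the live list into an array first and moves the child nodes across instead. The function name replaceTag and the sample input are made up for illustration, and the output step assumes you want the fragment back rather than a full document.

```php
<?php
// Hypothetical helper: turns every <$from> element into <$to>, keeping its children.
function replaceTag(string $html, string $from, string $to): string
{
    $dom = new DOMDocument;
    // Suppress parser warnings, as in the question.
    @$dom->loadHTML($html);

    // Copy the live node list into a plain array so replacements cannot shift it.
    $nodes = iterator_to_array($dom->getElementsByTagName($from));

    foreach ($nodes as $old) {
        $new = $dom->createElement($to);
        // appendChild() detaches each child from $old before attaching it to $new.
        while ($old->firstChild !== null) {
            $new->appendChild($old->firstChild);
        }
        $old->parentNode->replaceChild($new, $old);
    }

    // Serialize only the children of <body> to get the fragment back.
    $out = '';
    foreach ($dom->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

$result = replaceTag("<pre>hi <b>there</b>!</pre><pre>bye</pre>", 'pre', 'div');
echo $result, "\n";
```

Attributes on the original elements are not copied in this sketch; they would need to be carried over separately if they matter.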
Czar Pino answered Nov 19 '22 02:11


Another way is with paquettg/php-html-parser. I didn't find a way to change the tag name, so I had to use a hack that re-binds $this:

use PHPHtmlParser\Dom;
use PHPHtmlParser\Dom\HtmlNode;

$dom = new Dom;
$dom->load($text);
/** @var HtmlNode[] $tags */
foreach($dom->find('pre') as $tag) {
    $changeTag = function() {
        $this->name = 'div';
    };
    $changeTag->call($tag->tag);
}
echo (string)$dom;
Slava V answered Nov 19 '22 02:11