
DOM: fetch all text nodes in the document (PHP)

I have the following PHP code that traverses an entire DOM document to collect all of its text nodes. It's a bit of an ugly solution, and I'm sure there must be a better way. Is there?

$skip = false;
$node = $document;
$nodes = array();
while ($node) {
    if ($node->nodeType == XML_TEXT_NODE) { // XML_TEXT_NODE === 3
        $nodes[] = $node;
    }
    if (!$skip && $node->firstChild) {
        $node = $node->firstChild;
    } elseif ($node->nextSibling) {
        $node = $node->nextSibling;
        $skip = false;
    } else {
        $node = $node->parentNode;
        $skip = true;
    }
}

Thanks.

Jack Sleight asked Apr 20 '09 15:04



1 Answer

The XPath expression you need is //text(). Try using it with DOMXPath::query. For example:

$xpath = new DOMXPath($doc);
$textnodes = $xpath->query('//text()');
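
For a fuller picture, here is a minimal self-contained sketch of the same idea. The sample markup and the variable names $html and $texts are made up for illustration and are not part of the original question; the sketch also assumes the document is loaded with DOMDocument::loadHTML.

```php
<?php
// Hypothetical sample markup, used only for illustration.
$html = '<html><body><p>Hello <b>world</b></p><p>Bye</p></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

$xpath = new DOMXPath($doc);
// query() returns a DOMNodeList containing every text node in the document.
$textnodes = $xpath->query('//text()');

// Collect the string value of each text node into a plain array.
$texts = [];
foreach ($textnodes as $node) {
    $texts[] = $node->nodeValue;
}

print_r($texts);
```

If whitespace-only text nodes get in the way (they usually appear when the markup is indented), the expression //text()[normalize-space()] should filter them out.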
Rob Kennedy answered Nov 09 '22 17:11
