
Regexp for html [duplicate]

Tags: html, regex, php

Possible Duplicate:
RegEx match open tags except XHTML self-contained tags

I have the following string:

$str = " 
<li>r</li>  
<li>a</li>  
<li>n</li>  
<li>d</li>  
...
<li>om</li>  
";

How do I get the HTML for the first n <li> tags?

Example: n = 3; result = "<li>r<...>n</li>";

I would like a regexp if possible.

johnlemon asked Aug 30 '10 20:08


2 Answers

Like this.

$dom = new DOMDocument();
@$dom->loadHTML($str); // @ suppresses parser warnings for the HTML fragment
$x = new DOMXPath($dom);

// We want the 4th node.
foreach ($x->query("//li[4]") as $node)
{
  echo $node->C14N();
}

Oh yeah, learn XPath; it will save you lots of trouble in the future.
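
To return the first n items rather than a single node, the XPath predicate can use position(). The following is a minimal sketch, not part of the original answer: firstListItems() is a hypothetical helper name, and it assumes the parser accepts list items without a wrapping <ul>, as in the question's string.

```php
<?php
// Sketch: return the markup of the first $n <li> elements found in $html.
// firstListItems() is a hypothetical name used only for illustration.
function firstListItems(string $html, int $n): string
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $out = '';
    // (//li)[position() <= $n] selects the first $n <li> nodes in document order.
    foreach ($xpath->query("(//li)[position() <= $n]") as $node) {
        $out .= $dom->saveHTML($node); // serialize just this node
    }
    return $out;
}

$str = "<li>r</li><li>a</li><li>n</li><li>d</li><li>om</li>";
echo firstListItems($str, 3);
```

Wrapping //li in parentheses makes position() count across the whole document; without them, the predicate counts siblings under each parent separately.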

Byron Whitlock answered Oct 02 '22 09:10


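@Byron's solution, but with SimpleXML:

// simplexml_load_string() needs well-formed XML with a single root element,
// so the list items are wrapped in a <ul> here.
$xml = simplexml_load_string("<ul>" . $str . "</ul>");

foreach ($xml->xpath("//li[4]") as $node) {
  echo $node[0]; // Echoing the element prints its text content
}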

EDIT: Another reason I really like SimpleXML is how easy it is to debug the content of a node. You can just use print_r($xml) to print the object with its child nodes.
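
SimpleXML can also return the markup of the first n items. The sketch below rests on some assumptions: firstListItemsXml() and the <ul> wrapper are hypothetical, added only because simplexml_load_string() requires well-formed XML with a single root element, so it will not cope with sloppy real-world HTML.

```php
<?php
// Sketch: first $n <li> elements via SimpleXML; input must be well-formed XML.
// firstListItemsXml() and the <ul> wrapper are assumptions for illustration.
function firstListItemsXml(string $items, int $n): string
{
    // Wrap the fragment so the parser sees a single root element.
    $xml = simplexml_load_string('<ul>' . $items . '</ul>');
    if ($xml === false) {
        throw new InvalidArgumentException('Input is not well-formed XML');
    }

    $out = '';
    foreach ($xml->xpath("(//li)[position() <= $n]") as $li) {
        $out .= $li->asXML(); // asXML() returns the element's markup
    }
    return $out;
}

echo firstListItemsXml('<li>r</li><li>a</li><li>n</li><li>d</li>', 3);
```

For input that is not valid XML, the DOMDocument::loadHTML() approach from the other answer is the more forgiving choice.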

2ndkauboy answered Oct 02 '22 09:10