How can I grab the entire content inside `<body>` tag with regex?

For instance,

<html><body><p><a href="#">xx</a></p>

<p><a href="#">xx</a></p></body></html> 

I want to return this only,

<p><a href="#">xx</a></p>

<p><a href="#">xx</a></p>

Or are there any better ideas? Maybe DOM, but then I have to use saveHTML(), which also returns the doctype and the body tag...

HTML Purifier is a pain to use, so I decided not to use it. I thought regex could be the next best option for my disaster.

Run asked Jul 31 '11 20:07


2 Answers

preg_match("/<body[^>]*>(.*?)<\/body>/is", $html, $matches);

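`$matches[1]` will then hold the contents of the body tag. The `i` flag makes the match case-insensitive, and the `s` flag lets `.` match newlines, so a body spanning several lines is captured.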
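A minimal sketch of how the pattern above might be wrapped so that a missing `<body>` is handled explicitly. The function name `extractBodyContents` is hypothetical (it is not part of the answer), and the nullable return type assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper: returns the inner HTML of the first <body> element,
// or null when the pattern does not match.
function extractBodyContents(string $html): ?string
{
    // i = case-insensitive, s = "." also matches newlines.
    if (preg_match('/<body[^>]*>(.*?)<\/body>/is', $html, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Sample input taken from the question.
$html = "<html><body><p><a href=\"#\">xx</a></p>\n\n<p><a href=\"#\">xx</a></p></body></html>";
echo extractBodyContents($html);
```

Checking the return value of `preg_match` avoids reading `$matches[1]` when the input has no body tag at all.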

Flambino answered Nov 07 '22 09:11


preg_match('~<body.*?>(.*?)</body>~is', $html, $match);
print_r($match); // $match[1] holds the contents of the body tag
genesis answered Nov 07 '22 09:11
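On the DOM route mentioned in the question: since PHP 5.3.6, `DOMDocument::saveHTML()` accepts an optional node argument and serializes only that node, so the doctype and the body tag can be left out by saving each child of `<body>` in turn. The following is only a sketch under those assumptions; the function name `bodyInnerHtml` is hypothetical, it needs the DOM extension, and it expects non-empty input.

```php
<?php
// Hypothetical helper: concatenates the serialized children of <body>.
function bodyInnerHtml(string $html): string
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of printing them,
    // since real-world markup is often not perfectly valid.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }

    $inner = '';
    foreach ($body->childNodes as $child) {
        // Passing a node makes saveHTML() output just that node.
        $inner .= $doc->saveHTML($child);
    }
    return $inner;
}

$html = "<html><body><p><a href=\"#\">xx</a></p>\n\n<p><a href=\"#\">xx</a></p></body></html>";
echo bodyInnerHtml($html);
```

Unlike the regex answers, this relies on the HTML parser to find the body, so it should cope better with unusual attribute values or nested markup. Note that `loadHTML()` does not assume UTF-8 by default, so non-ASCII input may need its encoding declared.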