 

How to grab the contents of HTML tags?

Hey, so what I want to do is snag the content of the first paragraph. The string $blog_post contains a lot of paragraphs in the following format:

<p>Paragraph 1</p><p>Paragraph 2</p><p>Paragraph 3</p>

The problem I'm running into is that I am writing a regex to grab everything between the first <p> tag and the first closing </p> tag. However, it is matching from the first <p> tag to the last closing </p> tag, which results in me grabbing everything.

Here is my current code:

if (preg_match("/[\\s]*<p>[\\s]*(?<firstparagraph>[\\s\\S]+)[\\s]*<\\/p>[\\s\\S]*/",$blog_post,$blog_paragraph))
   echo "<p>" . $blog_paragraph["firstparagraph"] . "</p>";
else
  echo $blog_post;
Andrew G. Johnson asked Sep 02 '08 01:09



1 Answer

Well, sysrqb's answer will let you match anything in the first paragraph, assuming there's no other HTML in the paragraph. You might want something more like this:

<p>.*?</p>

Placing the ? after your * makes it non-greedy, meaning it will only match as little text as necessary before matching the </p>.
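To make that concrete, here is a rough sketch of how the non-greedy pattern could be dropped into the code from the question. The helper name first_paragraph and the sample string are made up for illustration, and the s modifier (so that . also matches newlines) is my own addition, on the assumption that a paragraph may span several lines.

```php
<?php
// Hypothetical sample input, in the format described in the question.
$blog_post = "<p>Paragraph 1</p><p>Paragraph 2</p><p>Paragraph 3</p>";

// Hypothetical helper: returns the contents of the first <p>...</p> pair,
// or null when the string contains no such pair.
function first_paragraph(string $html): ?string
{
    // .*? is the non-greedy form, so the match stops at the first </p>.
    // The s modifier lets . match newlines inside the paragraph.
    if (preg_match('/<p>(?<firstparagraph>.*?)<\/p>/s', $html, $matches)) {
        return trim($matches['firstparagraph']);
    }
    return null;
}

$first = first_paragraph($blog_post);

// Same fallback as in the question: print the whole post if nothing matched.
echo $first !== null ? "<p>" . $first . "</p>" : $blog_post;
```

Note that this only matches a bare <p> tag. If the opening tag can carry attributes, the pattern would need adjusting, and for anything more involved an HTML parser is probably the safer route.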

Kibbee answered Sep 21 '22 05:09
