Retrieve XPath content from div id

Tags:

html

xpath

How do I retrieve the text inside article-field1?

  <title>Testing</title>
  <link>http://example.org</link>
  <description>Description</description>
  <language>en-us</language>
  <lastBuildDate>Mon, 13 Feb 2012 00:00:00 +0000</lastBuildDate>

  <item>
    <title>Title Here</title>
    <link>http://example.org/2012/03/27/</link>
    <description><![CDATA[
        <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
        <div id="article-field2">123</div>
    ]]></description>
    <pubDate>Tue, 2 Mar 2012 00:00:00 +0000</pubDate>
  </item>

I've tried to use

//description/div[@id="article-field1"]/text()

Any advice?

Thanks

shadow asked Feb 15 '12 07:02


1 Answer

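From what I can see, your data is inside a CDATA section. The XML parser treats everything in a CDATA section as plain character data, not markup, so the description element has no div children and your XPath matches nothing. You would have to select the text of description and parse that string separately as HTML.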

See the question "How do I retrieve element text inside CDATA markup via XPath?" for more details.
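As a rough illustration of the two-step approach, here is a minimal Python sketch using only the standard library. It makes several assumptions that are not in the question: the feed is wrapped in a channel element, the CDATA section is properly closed, and the embedded HTML happens to be well-formed enough to parse as XML once wrapped in a made-up root element (real-world HTML would likely need a proper HTML parser instead). The function name field_text is hypothetical. Note also that the text "Test 1" sits inside the a element, not directly inside the div, so the sketch collects all descendant text.

```python
import xml.etree.ElementTree as ET

# Assumed sample feed: the <channel> wrapper and the closing "]]>" are
# assumptions added so that the snippet is a well-formed document.
FEED = """<channel>
  <item>
    <title>Title Here</title>
    <description><![CDATA[
        <div id="article-field1"><a href="http://example.org/test1">Test 1</a></div>
        <div id="article-field2">123</div>
    ]]></description>
  </item>
</channel>"""


def field_text(feed_xml, field_id):
    """Return the text inside the div with the given id, or None."""
    # Step 1: parse the feed. The CDATA content arrives as one plain string.
    root = ET.fromstring(feed_xml)
    description = root.find(".//item/description")
    if description is None or description.text is None:
        return None

    # Step 2: parse that string on its own. The wrapper element is needed
    # because the fragment contains two top-level divs.
    fragment = ET.fromstring("<root>" + description.text + "</root>")
    div = fragment.find(".//div[@id='{}']".format(field_id))
    if div is None:
        return None

    # The text lives in a nested <a>, so gather all descendant text.
    return "".join(div.itertext()).strip()


print(field_text(FEED, "article-field1"))
```

With an XPath engine that supports full XPath 1.0, the second step could presumably use something like //div[@id="article-field1"]//text() on the re-parsed fragment; the original expression would miss the link text even without the CDATA problem, because text() only selects direct text children of the div.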

Olivier.Roger answered Oct 19 '22 10:10