
Extracting information from HTML using XPath

Tags:

html

xpath

I have a snippet of HTML that I extracted from the source of a webpage I'm working on:

<span itemprop="homeLocation" itemscope itemtype="http://schema.org/Place"><meta itemprop="name" content="Kansas"/>

...and I'd like to extract the location, Kansas, from it using XPath.

I have been testing this in an XPath checker, but to no avail.

I tried

//*[@itemprop="homeLocation"]/meta[@itemprop="name"]/@content

and similar attempts, but can't seem to get a match. I don't understand what I'm doing wrong.

Any advice would be greatly appreciated.

tumultous_rooster asked Jun 29 '26 17:06



1 Answer

Your XPath is valid. The problems are with the markup, which is not well-formed XML:

  1. Close the span tag.
  2. Give the itemscope attribute a value (an empty string is enough).
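To see why those two fixes matter, here is a minimal sketch that feeds both versions of the snippet to a strict XML parser. It assumes Python's standard `xml.etree.ElementTree` as a stand-in for whatever XML-based checker is in use; the helper name `is_well_formed` is made up for illustration. An HTML-aware parser would likely be more forgiving of the original markup.

```python
import xml.etree.ElementTree as ET

# Snippet as posted in the question: no closing </span>, and
# itemscope has no value, which XML does not allow.
original = (
    '<span itemprop="homeLocation" itemscope '
    'itemtype="http://schema.org/Place">'
    '<meta itemprop="name" content="Kansas"/>'
)

# Same snippet with both fixes applied.
fixed = (
    '<span itemprop="homeLocation" itemscope="" '
    'itemtype="http://schema.org/Place">'
    '<meta itemprop="name" content="Kansas"/>'
    '</span>'
)


def is_well_formed(markup: str) -> bool:
    """Return True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print("original:", is_well_formed(original))
print("fixed:", is_well_formed(fixed))
```

If the checker parses its input as XML, it will reject the original before the XPath is ever evaluated, which would explain getting no match.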

Most importantly, the XPath checker you are trying to use seems to have some bugs. Try this one instead: http://www.freeformatter.com/xpath-tester.html#ad-output

XML I used:

    <span
      itemprop="homeLocation"
      itemscope=""
      itemtype="http://schema.org/Place">
      <meta itemprop="name" content="Kansas"/>
    </span>

Result:

Attribute='content="Kansas"'
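The same lookup can also be reproduced outside an online tester. The sketch below uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath: it has no `/@content` attribute step, so the attribute is read with `.get("content")` instead. The wrapping `<body>` element is an assumption added so that the span is a descendant of the parsed root, as it would be inside a real page.

```python
import xml.etree.ElementTree as ET

# Corrected snippet, wrapped in a hypothetical <body> element so the
# span is a descendant of the root and ".//*" can reach it.
page = """
<body>
  <span itemprop="homeLocation" itemscope=""
        itemtype="http://schema.org/Place">
    <meta itemprop="name" content="Kansas"/>
  </span>
</body>
"""

root = ET.fromstring(page)

# ElementTree equivalent of
#   //*[@itemprop="homeLocation"]/meta[@itemprop="name"]/@content
# with the final attribute step done in Python.
matches = root.findall(".//*[@itemprop='homeLocation']/meta[@itemprop='name']")
locations = [meta.get("content") for meta in matches]

print(locations)
```

With a full XPath 1.0 engine, the original expression, including the trailing `/@content`, should work unchanged on the corrected markup.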
vvg answered Jul 01 '26 07:07



