I built a small script that is supposed to find a specific string in a page and return the XPath of the element containing it. The goal is to reuse that XPath to find other strings in the same context.
I'm using this code:
import requests
from lxml import html
page = requests.get("http://www.w3schools.com/xpath/")
tree = html.fromstring(page.text)
result = tree.xpath('//*[. = "XML"]')
result[0]
returns <Element b at 0x7f034a08e940>
and I can't figure out how to get this element's XPath.
The string I would like to have is:
/html/body/div[4]/div/div[2]/div[2]/div[1]/div/ul/li[2]
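You can use getpath() to get the XPath of an element. Note that getpath() is a method of the ElementTree rather than of the element itself, which is why the code below calls getroottree() first. For example: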
import requests
from lxml import html
page = requests.get("http://www.w3schools.com/xpath/")
root = html.fromstring(page.text)
tree = root.getroottree()
result = root.xpath('//*[. = "XML"]')
for r in result:
    print(tree.getpath(r))
Output:
/html/body/div[3]/div/ul/li[10]
/html/body/div[3]/div/ul/li[10]/a
/html/body/div[4]/div/div[2]/div[2]/div[1]/div/ul/li[2]
/html/body/div[5]/div/div[6]/h3
/html/body/div[6]/div/div[4]/h3
/html/body/div[7]/div/div[4]/h3
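If you want to try this without hitting the network, here is a minimal sketch on an inline document. The SAMPLE markup and the variable names are made up for illustration and are not taken from the w3schools page. It also shows one possible way to get at the "same context" idea from the question: dropping the trailing [n] from a path so that it selects all same-tag siblings. That last step is only an assumption about what "same context" means for your page.

```python
import re
from lxml import html

# Hypothetical sample page, only for illustration.
SAMPLE = """
<html><body>
<div><ul><li>HTML</li><li>XML</li></ul></div>
<div><p>intro</p><h3>XML</h3></div>
</body></html>
"""

root = html.fromstring(SAMPLE)
tree = root.getroottree()  # getpath() lives on the ElementTree

# Elements whose string value is exactly "XML"
matches = root.xpath('//*[. = "XML"]')
paths = [tree.getpath(el) for el in matches]
for p in paths:
    print(p)

# Each generated path selects the element it came from.
for el, p in zip(matches, paths):
    assert tree.xpath(p)[0] is el

# Assumption: "same context" means same-tag siblings of the first match,
# so strip the final positional predicate, e.g. .../li[2] -> .../li
sibling_path = re.sub(r'\[\d+\]$', '', paths[0])
sibling_texts = [e.text_content() for e in tree.xpath(sibling_path)]
print(sibling_path, sibling_texts)
```

Keep in mind that the paths returned by getpath() are positional, so they can stop matching as soon as the page layout changes.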