Here is my code:
from lxml import html
import requests
page = requests.get('https://en.wikipedia.org/wiki/Nabucco')
tree = html.fromstring(page.content)
title = tree.xpath('//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i')
print(title)
Problem: print(title) prints "[]", an empty list. I expected it to print "Nabucco". The XPath expression comes from the Chrome inspector's "Copy XPath" function.
Why isn't this working? Is there a disagreement between lxml and Chrome's XPath engine? Or am I missing something? I am somewhat new to Python, lxml and XPath.
That's because of the tbody tag. You see it in the browser because the browser inserted it. requests is not a browser and just downloads the page source as is.
Replace:
//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i
with:
//*[@id="mw-content-text"]/table[1]/tr[1]/th/i
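To see the difference without hitting the network, here is a minimal sketch that runs both expressions against a small inline HTML string. The SOURCE markup is a made-up stand-in for the downloaded page source (it is not the real Wikipedia markup); the only assumption that matters is that the raw source has no tbody tag. It also shows that xpath() returns a list of elements, so you still need text_content() to get the string "Nabucco".

```python
from lxml import html

# Hypothetical stand-in for the raw page source: no <tbody>, as the server sends it.
SOURCE = """
<html><body>
<div id="mw-content-text">
  <table>
    <tr><th><i>Nabucco</i></th></tr>
    <tr><td>Opera by Giuseppe Verdi</td></tr>
  </table>
</div>
</body></html>
"""

tree = html.fromstring(SOURCE)

# XPath copied from the browser: fails because lxml does not add <tbody>.
with_tbody = tree.xpath('//*[@id="mw-content-text"]/table[1]/tbody/tr[1]/th/i')

# Same path with the tbody step removed: matches the <i> element.
without_tbody = tree.xpath('//*[@id="mw-content-text"]/table[1]/tr[1]/th/i')

# Variant that works whether or not a <tbody> is present in the source.
either = tree.xpath('//*[@id="mw-content-text"]/table[1]//tr[1]/th/i')

# xpath() returns elements, so extract the text explicitly.
titles = [el.text_content() for el in without_tbody]

print(with_tbody)
print(titles)
```

The "//tr[1]" variant is handy when some pages include tbody in their source and others do not, but note that it can also match rows of nested tables.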