I'm using Google App Engine with Python. I want to get the tree of an HTML file from the same project as my Python script. I tried several things, such as using the absolute URL (e.g. http://localhost:8080/nl/home.html) and the relative URL (/nl/home.html). Neither works. I use this code:
class HomePage(webapp2.RequestHandler):
    def get(self):
        path = self.request.path
        htmlfile = etree.parse(path)
        template = jinja_environment.get_template('/nl/template.html')
        pagetitle = htmlfile.find(".//title").text
        body = htmlfile.get_element_by_id("body").toString()
It returns the following error: IOError: Error reading file '/nl/home.html': failed to load external entity "/nl/home.html"
Does anyone know how to get the tree of an HTML file from the same project with Python?
EDIT
This is the working code:
class HomePage(webapp2.RequestHandler):
    def get(self):
        path = self.request.path.replace("/","",1)
        logging.info(path)
        htmlfile = html.fromstring(urllib.urlopen(path).read())
        template = jinja_environment.get_template('/nl/template.html')
        pagetitle = htmlfile.find(".//title").text
        body = innerHTML(htmlfile.get_element_by_id("body"))

def innerHTML(node):
    buildString = ''
    for child in node:
        buildString += html.tostring(child)
    return buildString
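Your working directory is the base of your app directory. So if your app keeps the file in an nl/ directory under that base, you can read it at the relative path nl/home.html, without the leading slash (assuming you haven't changed your working directory). With the leading slash, /nl/home.html is treated as an absolute filesystem path, which is why the parser cannot find it.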
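To make the path handling concrete, here is a minimal sketch that uses only the standard library. The helper names resolve_app_file and read_app_file are hypothetical (they are not part of webapp2 or lxml), and the sketch assumes the file is UTF-8 encoded and that the working directory is still the app's base directory when no root is passed in.

```python
import io
import os


def resolve_app_file(request_path, app_root=None):
    """Map a request path such as '/nl/home.html' to a file under app_root."""
    if app_root is None:
        # Assumption: the working directory is still the app's base directory.
        app_root = os.getcwd()
    app_root = os.path.abspath(app_root)
    # Drop the leading slash so the path is relative, not filesystem-absolute.
    relative = request_path.lstrip("/")
    full_path = os.path.abspath(os.path.join(app_root, relative))
    # Refuse paths that climb out of the app directory (e.g. '/../secret').
    rel_to_root = os.path.relpath(full_path, app_root)
    if rel_to_root == os.pardir or rel_to_root.startswith(os.pardir + os.sep):
        raise ValueError("path escapes the app directory: %r" % request_path)
    return full_path


def read_app_file(request_path, app_root=None):
    """Return the file's contents as text, assuming UTF-8 encoding."""
    with io.open(resolve_app_file(request_path, app_root), encoding="utf-8") as f:
        return f.read()
```

In the handler, read_app_file(self.request.path) would return the markup as a string, which could then be passed to html.fromstring as in the working code above. The check against paths that leave the app directory is an addition of this sketch, since the path comes straight from the request.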