
href attribute for lxml.html

According to this answer:

>>> from lxml.html import fromstring
>>> s = """<input type="hidden" name="question" value="1234">"""
>>> doc = fromstring(s)
>>> doc.value
'1234'
>>> doc.name
'question'

I tried to get both the link and the text from this code:

from lxml.html import fromstring
s = '<a href="http://a.com" rel="bookmark">bla bla bla</a>'
doc = fromstring(s)
print (doc.href)
print (doc.text_content())

It raises an AttributeError: 'HtmlElement' object has no attribute 'href'

I'm new to lxml. What is the problem?

How can I get both the link (a.com) and the text (bla bla bla) as strings from this code?

nazmus saif asked Feb 11 '23 09:02



1 Answer

This code works for me:

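from lxml.html import document_fromstring
doc = document_fromstring('<a href="http://a.com" rel="bookmark">bla bla bla</a>')
# Attributes are read with .get() rather than as Python attributes
print(doc.xpath("//a")[0].get("href"))
# text_content() returns the text of the element and its descendants
print(doc.text_content())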

Output:

http://a.com
bla bla bla
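
As for why doc.href fails while doc.value works: lxml.html gives form elements such as <input> a dedicated class with properties like value and name, whereas an <a> comes back as a plain HtmlElement, so its attributes have to be read with .get() or .attrib. A minimal sketch of the same idea using fromstring directly, as in the question (link_and_text is a hypothetical helper name, not part of lxml):

```python
from lxml.html import fromstring


def link_and_text(html):
    """Hypothetical helper: return (href, text) for the first <a> in html."""
    root = fromstring(html)
    # iter("a") also yields root itself when root is the <a> element,
    # which is the case for a single-element fragment like the one here.
    anchor = next(root.iter("a"), None)
    if anchor is None:
        return None, None
    # .get() returns None for a missing attribute; .attrib["href"] would raise KeyError.
    return anchor.get("href"), anchor.text_content()


href, text = link_and_text('<a href="http://a.com" rel="bookmark">bla bla bla</a>')
print(href)
print(text)
```

Using .get() keeps the code from raising when a link has no href; .attrib behaves like a dict if you prefer the exception.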
Valeriy Gaydar answered Apr 28 '23 12:04
