I am using lxml to extract data from web pages, but I am unable to convert the resulting _ElementUnicodeResult objects to strings. Here is my code:
import requests
from lxml import html
from lxml import etree
from lxml.etree import tostring
url = 'https://www.imdb.com/title/tt5848272/?pf_rd_m=A2FGELUUNOQJNL&pf_rd_p=2413b25e-e3f6-4229-9efd-599bb9ab1f97&pf_rd_r=9S5A89ZHEXE4K8SZBC40&pf_rd_s=right-2&pf_rd_t=15061&pf_rd_i=homepage&ref_=hm_otw_t0'
page = requests.get(url)
tree = html.fromstring(page.content)
a = tree.xpath('//div[@class="credit_summary_item"]/a[../h4/text() = "Directors:"]/text()')
mynewlist = []
for i in a:
    b = etree.tostring(i, method="text")
    mynewlist.append(b)
Here is the error I get:
TypeError: Type 'lxml.etree._ElementUnicodeResult' cannot be serialized.
Any help would be greatly appreciated.
I also had trouble converting lxml.etree._ElementUnicodeResult to a string. Then I found the following link:
https://lxml.de/api/lxml.etree._ElementUnicodeResult-class.html
It shows that _ElementUnicodeResult inherits many methods from unicode (str in Python 3). I called its __str__() method, which converted it to a plain string.
It also supports a number of other string operations directly; see the link for details. Hope this helps ;)
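As a rough illustration of that point, here is a minimal sketch. It assumes Python 3 and uses a made-up HTML fragment rather than the IMDb page, so the markup and the name in it are only placeholders:

```python
from lxml import html

# Hypothetical fragment, used only for illustration (not the real page).
doc = html.fromstring('<div><a> Jane Doe </a></div>')

# text() results come back as _ElementUnicodeResult objects.
result = doc.xpath('//a/text()')[0]

is_str = isinstance(result, str)      # the result is already a str subclass
stripped = result.strip()             # so ordinary string methods work directly
plain = str(result)                   # same effect as calling result.__str__()
parent_tag = result.getparent().tag   # extra lxml feature: the element the text came from

print(type(result).__name__, type(plain).__name__, repr(stripped), parent_tag)
```

In practice str(result) is the more usual spelling; calling __str__() directly does the same thing.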
The i variable is an _ElementUnicodeResult object (a special type of string). You cannot use it as an argument to tostring().
The a variable (the result of the XPath evaluation) is the list of strings that you want. If the elements of this list must be plain strings instead of _ElementUnicodeResult objects, you can use a list comprehension:
newlist = [str(s) for s in a]
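To try this without hitting the network, here is a small self-contained sketch. The SAMPLE markup is an assumption that only imitates the structure the XPath expression expects; the real page may look different.

```python
from lxml import etree, html

# Made-up markup that imitates the structure targeted by the XPath expression.
SAMPLE = """
<div class="credit_summary_item">
  <h4>Directors:</h4>
  <a>First Director</a>,
  <a>Second Director</a>
</div>
"""

tree = html.fromstring(SAMPLE)
a = tree.xpath('//div[@class="credit_summary_item"]/a[../h4/text() = "Directors:"]/text()')

# tostring() serializes elements, not text results, so this raises TypeError.
try:
    etree.tostring(a[0], method="text")
    raised = False
except TypeError:
    raised = True

# Converting each result with str() yields plain strings.
newlist = [str(s) for s in a]
print(raised, newlist)
```

If you only need to print or compare the values, the conversion is optional, since the items in a already behave like strings.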