I have a raw HTML string that I want to convert to a Scrapy HTML response object so that I can use the css and xpath selectors, similar to Scrapy's response. How can I do it?
First of all, if it is for debugging or testing purposes, you can use the Scrapy shell:
    $ cat index.html
    <div id="test">
        Test text
    </div>

    $ scrapy shell index.html
    >>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
    u'Test text'
There are different objects available in the shell during the session, like response and request.
Alternatively, you can instantiate the HtmlResponse class and pass the HTML string as body:
    >>> from scrapy.http import HtmlResponse
    >>> response = HtmlResponse(url="my HTML string", body='<div id="test">Test text</div>', encoding='utf-8')
    >>> response.xpath('//div[@id="test"]/text()').extract()[0].strip()
    u'Test text'