My spider.py file looks like this:
def start_requests(self):
    for url in self.start_urls:
        yield scrapy.Request(
            url,
            self.parse,
            headers={'My-Custom-Header': 'Custom-Header-Content'},
            meta={
                'splash': {
                    'args': {
                        'html': 1,
                        'wait': 5,
                    },
                }
            },
        )
And my parse method is as follows:
def parse(self, response):
    print(response.request.headers)
When I run my spider, the following headers are printed:
{
    b'Content-Type': [b'application/json'],
    b'Accept': [b'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8'],
    b'Accept-Language': [b'en'],
    b'User-Agent': [b'Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.2309.372 Safari/537.36'],
    b'Accept-Encoding': [b'gzip,deflate']
}
As you can see, this does not include the custom header I added to the Scrapy request.
Can anybody help me add custom header values to this request?
Thanks in advance.
If you want Splash to use your headers in the request to your specified URL, then you should add the headers to the args part, together with html and wait:
meta={
    'splash': {
        'args': {
            'html': 1,
            'wait': 5,
            'headers': {
                'My-Custom-Header': 'Custom-Header-Content',
            },
        },
    }
}
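If several requests need the same treatment, the meta dict above can be built by a small helper. This is only a sketch: build_splash_meta is a hypothetical name, not part of Scrapy or scrapy-splash, and the default wait of 5 simply mirrors the value used in the question.

```python
def build_splash_meta(headers, wait=5):
    """Return a Scrapy meta dict that puts `headers` inside the Splash args.

    Hypothetical helper for illustration; not part of Scrapy or scrapy-splash.
    """
    return {
        'splash': {
            'args': {
                'html': 1,
                'wait': wait,
                # Copy the dict so later changes by the caller do not leak in.
                'headers': dict(headers),
            },
        }
    }


meta = build_splash_meta({'My-Custom-Header': 'Custom-Header-Content'})
```

In start_requests, the result would then be passed as meta=build_splash_meta({...}) in place of the inline dict.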