Scrapy capitalizes request headers

Question

I'm setting the headers the following way:

headers = {
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
    'cache-control': 'no-cache',
...
}

And calling the request like this:

yield scrapy.Request(url='https://myurl.com/', callback=self.parse,
                     headers=headers, cookies=cookies, meta={'proxy': 'http://localhost:8888'})

As a result, Scrapy capitalizes all of these headers, and they look like this on the wire (I'm using Charles proxy for debugging):

Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Cache-Control: no-cache

And this does not work correctly in my case.

If I use curl and set the headers in lowercase

accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
cache-control: no-cache

everything works like a charm.

Is there any way to disable this capitalizing behavior in Scrapy? Thanks for any help!

Done Data Solutions · Accepted Answer

This can't be done out of the box with Scrapy.

Reason: Scrapy manages headers in a case-insensitive way by design (see: https://github.com/scrapy/scrapy/blob/master/scrapy/http/headers.py). I guess they do it to avoid trouble with duplicate headers.

So most probably you'll have to fork Scrapy and roll your own implementation of header handling, or at least do some monkey patching.
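To illustrate the idea, here is a minimal, self-contained sketch. It is not Scrapy's code: `TitleCaseHeaders`, `LowerCaseHeaders`, and `build` are hypothetical stand-ins that only mimic the pattern (as far as I know, Scrapy's `Headers` class normalizes names through a `normkey` method that title-cases them). The subclass shows the kind of override a fork or monkey patch would have to make. Note that the underlying HTTP client may apply its own capitalization when writing the request, so a change at this level alone may not be enough.

```python
class TitleCaseHeaders(dict):
    """Hypothetical stand-in for a case-insensitive header store."""

    def normkey(self, key):
        # Title-case each hyphen-separated part of the header name.
        return key.title()

    def __setitem__(self, key, value):
        super().__setitem__(self.normkey(key), value)

    def __getitem__(self, key):
        return super().__getitem__(self.normkey(key))

    def __contains__(self, key):
        return super().__contains__(self.normkey(key))


class LowerCaseHeaders(TitleCaseHeaders):
    """Same store, but names are normalized to lowercase instead."""

    def normkey(self, key):
        return key.lower()


def build(header_cls, raw):
    # Go through __setitem__ so every name passes through normkey().
    headers = header_cls()
    for name, value in raw.items():
        headers[name] = value
    return headers


raw = {'accept': 'text/html', 'cache-control': 'no-cache'}
default = build(TitleCaseHeaders, raw)
patched = build(LowerCaseHeaders, raw)

print(list(default))  # ['Accept', 'Cache-Control']
print(list(patched))  # ['accept', 'cache-control']
```

Both variants stay case-insensitive on lookup (`'ACCEPT' in default` is true), which is what protects against duplicate headers; only the stored spelling differs.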

But I'm wondering whether that is really what you need. I know that some websites fingerprint request headers to detect bots, but the capitalized headers generated by Scrapy look much less bot-like than the all-lowercase headers you want to send with your requests.

Tags:

python

scrapy

kspi33
