Is it possible to get the request referrer from the response object in the parse function?
Thanks.
The HTTP Referer field is set by the HTTP client in the request headers, not in the response headers, because this header tells the server which page the client came from to reach the current one.
It would be rather odd to receive an HTTP Referer header in a response.
But in Scrapy, the Response object keeps a reference to the Request that generated it, in the response's request field, so the following call:
response.request.headers.get('Referer', None)
can return the Referer header if it was set when the request was made.
The question above was asked a long time ago, and it has been answered well.
However, I thought I would add a different answer in case the answer by Rostyslav Dzinko does not apply/work in your case.
Let's say that you have two different parser methods: parser_A, which parses the list of articles (the list page), and parser_B, which parses each article page.
If you cannot get the URL of the list page (the referer URL) once you are in parser_B, you can set a custom header in parser_A and pass it along to parser_B, as in the following example:
yield scrapy.Request(url=article_page_url, callback=self.parser_B, dont_filter=True, headers={'referer_url': list_page_url})
Then, in the parser_B method, you can do the following to obtain the list page's URL:
print(response.request.headers.get('referer_url'))
Hope this helps those who needed help.