I'm writing some tests with Selenium and noticed that Referer is missing from the headers. I wrote the following minimal example to test this with https://httpbin.org/headers:
import selenium.webdriver
import selenium.webdriver.support.ui  # explicit import so WebDriverWait below resolves
options = selenium.webdriver.FirefoxOptions()
options.add_argument('--headless')
profile = selenium.webdriver.FirefoxProfile()
profile.set_preference('devtools.jsonview.enabled', False)
driver = selenium.webdriver.Firefox(firefox_options=options, firefox_profile=profile)
wait = selenium.webdriver.support.ui.WebDriverWait(driver, 10)
driver.get('http://www.python.org')
assert 'Python' in driver.title
url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
wait.until(lambda driver: driver.current_url == url)
print(driver.page_source)
driver.close()
Which prints:
<html><head><link rel="alternate stylesheet" type="text/css" href="resource://content-accessible/plaintext.css" title="Wrap Long Lines"></head><body><pre>{
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.5",
"Connection": "close",
"Host": "httpbin.org",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:64.0) Gecko/20100101 Firefox/64.0"
}
}
</pre></body></html>
So there is no Referer. However, if I browse to any page and manually execute
window.location.href = "https://httpbin.org/headers"
in the Firefox console, Referer does appear as expected.
As pointed out in the comments below, when using
driver.get("javascript: window.location.href = '{}'".format(url))
instead of
driver.execute_script("window.location.href = '{}';".format(url))
the request does include Referer. Also, when using Chrome instead of Firefox, both methods include Referer.
So the main question still stands: Why is Referer missing in the request when sent with Firefox as described above?
You cannot set the Referer header manually from page script, but navigating by assigning to location.href makes the browser send the current page's address as the Referer. Note that this causes a page load.
The Referer header allows a server to identify a page where people are visiting it from. This data can be used for analytics, logging, optimized caching, and more. When you follow a link, the Referer contains the address of the page that owns the link.
To check the Referer in action, open Inspect Element -> Network and look for Referer among the request headers. Supported browsers: browsers compatible with the HTTP Referer header include Google Chrome.
The Referer (a misspelling of referrer) header contains the address of the previous web page from which a link to the currently requested page was followed. In simpler terms, the referer is the URL from which a request received by a server originated.
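To see what a server actually receives, here is a minimal sketch that uses only Python's standard library as a local stand-in for https://httpbin.org/headers. The names HeaderEchoHandler and fetch_echoed_headers are hypothetical and not part of Selenium or httpbin. The client is plain urllib rather than a browser, so it only demonstrates how the header reaches the server, not how Firefox decides to send it.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HeaderEchoHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: replies with the request headers as JSON."""

    def do_GET(self):
        body = json.dumps({"headers": dict(self.headers.items())}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress per-request logging to keep the example quiet


def fetch_echoed_headers(referer=None):
    """Start a throwaway local server, send one GET, return the headers it saw."""
    server = HTTPServer(("127.0.0.1", 0), HeaderEchoHandler)  # port 0 = any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:{}/headers".format(server.server_port)
        request = urllib.request.Request(url)
        if referer is not None:
            request.add_header("Referer", referer)
        # Empty ProxyHandler: talk to localhost directly, ignoring proxy env vars.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        with opener.open(request, timeout=5) as response:
            return json.loads(response.read().decode("utf-8"))["headers"]
    finally:
        server.shutdown()
        server.server_close()
```

In a Selenium test, a server like this could presumably replace httpbin.org as the navigation target, which keeps the test independent of an external service.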
Referer
As per the MDN documentation:
The Referer request header contains the address of the previous web page from which a link to the currently requested page was followed. The Referer header allows servers to identify where people are visiting them from and may use that data for analytics, logging, or optimized caching, for example.
Important: Although this header has many innocent uses it can have undesirable consequences for user security and privacy.
Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer
However:
A Referer header is not sent by browsers if:
- The referring resource is a local "file" or "data" URI.
- An unsecured HTTP request is used and the referring page was received with a secure protocol (HTTPS).
Source: https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referer
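The two conditions quoted above can be sketched as a small check. The function name referer_expected is hypothetical, and the sketch covers only these two rules; real browsers also apply Referrer-Policy and other logic that it ignores.

```python
from urllib.parse import urlsplit


def referer_expected(referring_url, target_url):
    """Return False when one of the two quoted rules suppresses the Referer."""
    scheme_from = urlsplit(referring_url).scheme.lower()
    scheme_to = urlsplit(target_url).scheme.lower()
    # Rule 1: the referring resource is a local "file" or "data" URI.
    if scheme_from in ("file", "data"):
        return False
    # Rule 2: the referring page was HTTPS and the request is plain HTTP.
    if scheme_from == "https" and scheme_to == "http":
        return False
    return True
```

For the navigation in the question (https://www.python.org/ to https://httpbin.org/headers), neither rule applies, so by these rules alone a Referer would be expected.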
There are some privacy and security risks associated with the Referer HTTP header:
The Referer header contains the address of the previous web page from which a link to the currently requested page was followed, which can be further used for analytics, logging, or optimized caching.
Source: https://developer.mozilla.org/en-US/docs/Web/Security/Referer_header:_privacy_and_security_concerns#The_referrer_problem
From the Referer header perspective, the majority of security risks can be mitigated by the following steps:
- Referrer-Policy: Using the Referrer-Policy header on your server to control what information is sent through the Referer header. Again, a directive of no-referrer would omit the Referer header entirely.
- The referrerpolicy attribute on HTML elements that are in danger of leaking such information (such as <img> and <a>). This can for example be set to no-referrer to stop the Referer header being sent altogether.
- The rel attribute set to noreferrer on HTML elements that are in danger of leaking such information (such as <img> and <a>).
- The Exit Page Redirect technique: The only method that should work at the moment without flaw is to have an exit page that you don’t mind having inside of the referer header. Many websites implement this method, including Google and Facebook. Instead of the referrer data showing private information, it only shows the website that the user came from, if implemented correctly. Instead of the referrer data appearing as http://example.com/user/foobar, the new referrer data will appear as http://example.com/exit?url=http%3A%2F%2Fexample.com. The method works by having all external links on your website go to an intermediary page that then redirects to the final page. For a link to the website example.com, we URL-encode the full URL and add it to the url parameter of our exit page.
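The URL encoding used by the exit page technique can be sketched with Python's standard library. The helper names build_exit_url and destination_from_exit_url are hypothetical, and the /exit path simply mirrors the example URL above.

```python
from urllib.parse import parse_qs, quote, urlsplit


def build_exit_url(exit_page, destination):
    """Wrap an external link so it passes through the exit page first."""
    # safe="" also percent-encodes ":" and "/" in the destination URL.
    return "{}?url={}".format(exit_page, quote(destination, safe=""))


def destination_from_exit_url(exit_url):
    """What the exit page would read back before redirecting."""
    return parse_qs(urlsplit(exit_url).query)["url"][0]
```

Calling build_exit_url("http://example.com/exit", "http://example.com") yields http://example.com/exit?url=http%3A%2F%2Fexample.com, matching the form quoted above.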
I have executed your code through both the GeckoDriver/Firefox and ChromeDriver/Chrome combinations:
driver.get('http://www.python.org')
assert 'Python' in driver.title
url = 'https://httpbin.org/headers'
driver.execute_script('window.location.href = "{}";'.format(url))
WebDriverWait(driver, 10).until(lambda driver: driver.current_url == url)
print(driver.page_source)
Using GeckoDriver/Firefox, the Referer: "https://www.python.org/" header was missing, as follows:
{
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.5",
"Host": "httpbin.org",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64; rv:67.0) Gecko/20100101 Firefox/67.0"
}
}
Using ChromeDriver/Chrome, the Referer: "https://www.python.org/" header was present, as follows:
{
"headers": {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
"Accept-Encoding": "gzip, deflate, br",
"Accept-Language": "en-US,en;q=0.9",
"Host": "httpbin.org",
"Referer": "https://www.python.org/",
"Upgrade-Insecure-Requests": "1",
"User-Agent": "Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.80 Safari/537.36"
}
}
It seems to be an issue with GeckoDriver/Firefox in handling the Referer header.
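To compare the two browsers in a test instead of reading the output by eye, the echoed headers can be pulled out of driver.page_source. This is a minimal sketch: headers_from_page_source is a hypothetical helper, and it assumes the browser renders the JSON inside a <pre> element, as in the Firefox output shown in the question.

```python
import html
import json
import re


def headers_from_page_source(page_source):
    """Extract the "headers" object from a rendered httpbin /headers page."""
    # Assumption: the JSON sits inside <pre>...</pre>; fall back to the raw text.
    match = re.search(r"<pre[^>]*>(.*?)</pre>", page_source, re.DOTALL)
    text = match.group(1) if match else page_source
    # Undo HTML entity escaping (e.g. &amp;) before parsing the JSON.
    return json.loads(html.unescape(text))["headers"]
```

A test could then assert on `"Referer" in headers_from_page_source(driver.page_source)` for each driver.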