Im working with the selenium remote driver to automate actions on a site, i can open the page i need directly by engineering the url as the sites url schema is very constant. This speeds up the script as it dose not have to work through several pages before it gets to the one it needs.
To make the automation seem organic is there a way to set a referral page in Selenium ?
If you're checking the referrer on the server, then using a proxy (as mentioned in other answers) will be the way to go.
However, if you need access to the referrer in Javascript using a proxy will not work. To set the Javascript referrer I did the following:
document.write('<script>window.location.href = "<my website>";</script>')"
I'm using a Python wrapper around selenium, so I cannot provide the function you need to inject the code in your language, but it should be easy to find.
What you are looking for is referer spoofing.
Selenium does not have an inbuilt method to do this, however it can be accomplished by using a proxy such as fiddler. Fiddler also provides an API-only version of the FiddlerCore component, and programmatic access to all of the proxy's settings and data, thus allowing you to modify the headers of the http response.
Here is a solution in Python to do exactly that:
https://github.com/j-bennet/selenium-referer
I described the use case and the solution in the README. I think github repo won't go anywhere, but I'll quote the relevant pieces here just in case.
The solution uses libmproxy to implement a proxy server that only does one thing: adds a Referer header. Header is specified as command line parameter when running the proxy. Code:
# -*- coding: utf-8 -*-
"""
Proxy server to add a specified Referer: header to the request.
"""
from optparse import OptionParser
from libmproxy import controller, proxy
from libmproxy.proxy.server import ProxyServer
class RefererMaster(controller.Master):
"""
Adds a specified referer header to the request.
"""
def __init__(self, server, referer):
"""
Init the proxy master.
:param server: ProxyServer
:param referer: string
"""
controller.Master.__init__(self, server)
self.referer = referer
def run(self):
"""
Basic run method.
"""
try:
print('Running...')
return controller.Master.run(self)
except KeyboardInterrupt:
self.shutdown()
def handle_request(self, flow):
"""
Adds a Referer header.
"""
flow.request.headers['referer'] = [self.referer]
flow.reply()
def handle_response(self, flow):
"""
Does not do anything extra.
"""
flow.reply()
def start_proxy_server(port, referer):
"""
Start proxy server and return an instance.
:param port: int
:param referer: string
:return: RefererMaster
"""
config = proxy.ProxyConfig(port=port)
server = ProxyServer(config)
m = RefererMaster(server, referer)
m.run()
if __name__ == '__main__':
parser = OptionParser()
parser.add_option("-r", "--referer", dest="referer",
help="Referer URL.")
parser.add_option("-p", "--port", dest="port", type="int",
help="Port number (int) to run the server on.")
popts, pargs = parser.parse_args()
start_proxy_server(popts.port, popts.referer)
Then, in the setUp() method of the test, proxy server is started as an external process, using pexpect, and stopped in tearDown(). Method called proxy() returns proxy settings to configure Firefox driver with:
# -*- coding: utf-8 -*-
import os
import sys
import pexpect
import unittest
from selenium.webdriver.common.proxy import Proxy, ProxyType
import utils
class ProxyBase(unittest.TestCase):
"""
We have to use our own proxy server to set a Referer header, because Selenium does not
allow to interfere with request headers.
This is the base class. Change `proxy_referer` to set different referers.
"""
base_url = 'http://www.facebook.com'
proxy_server = None
proxy_address = '127.0.0.1'
proxy_port = 8888
proxy_referer = None
proxy_command = '{0} {1} --referer {2} --port {3}'
def setUp(self):
"""
Create the environment.
"""
print('\nSetting up.')
self.start_proxy()
self.driver = utils.create_driver(proxy=self.proxy())
def tearDown(self):
"""
Cleanup the environment.
"""
print('\nTearing down.')
utils.close_driver(self.driver)
self.stop_proxy()
def proxy(self):
"""
Create proxy settings for our Firefox profile.
:return: Proxy
"""
proxy_url = '{0}:{1}'.format(self.proxy_address, self.proxy_port)
p = Proxy({
'proxyType': ProxyType.MANUAL,
'httpProxy': proxy_url,
'ftpProxy': proxy_url,
'sslProxy': proxy_url,
'noProxy': 'localhost, 127.0.0.1'
})
return p
def start_proxy(self):
"""
Start the proxy process.
"""
if not self.proxy_referer:
raise Exception('Set the proxy_referer in child class!')
python_path = sys.executable
current_dir = os.path.dirname(__file__)
proxy_file = os.path.normpath(os.path.join(current_dir, 'referer_proxy.py'))
command = self.proxy_command.format(
python_path, proxy_file, self.proxy_referer, self.proxy_port)
print('Running the proxy command:')
print(command)
self.proxy_server = pexpect.spawnu(command)
self.proxy_server.expect_exact(u'Running...', 2)
def stop_proxy(self):
"""
Override in child class to use a proxy.
"""
print('Stopping proxy server...')
self.proxy_server.close(True)
print('Proxy server stopped.')
I wanted my unit tests to start and stop the proxy server without any user interaction, and could not find any Python samples doing that. Which is why I created the github repo (link above).
Hope this helps someone.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With