Question

I'm working with the Selenium remote driver to automate actions on a site. I can open the page I need directly by constructing the URL, since the site's URL schema is very consistent. This speeds up the script, as it does not have to work through several pages before it gets to the one it needs.

To make the automation seem organic, is there a way to set a referrer page in Selenium?

Solution 2

What you are looking for is referer spoofing.

Selenium does not have a built-in method to do this; however, it can be accomplished by routing the browser through a proxy such as Fiddler. Fiddler also provides an API-only version, the FiddlerCore component, which gives programmatic access to all of the proxy's settings and data and thus lets you modify the headers of the HTTP request, including Referer.

OTHER TIPS

If you're checking the referrer on the server, then using a proxy (as mentioned in other answers) will be the way to go.

However, if you need access to the referrer in JavaScript (document.referrer), using a proxy will not work. To set the JavaScript referrer, I did the following:

  1. Go to the referral website
  2. Inject this JavaScript onto the page via the Selenium API: document.write('<script>window.location.href = "<my website>";</script>')

I'm using a Python wrapper around Selenium, so I cannot provide the exact function you need to inject the code in your language, but it should be easy to find.
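For reference, the two steps above could be sketched in Python roughly as follows. This is a minimal sketch, not the original poster's code: it assumes `driver` is a Selenium WebDriver instance (only its `get` and `execute_script` methods are used), and the helper names `build_redirect_script` and `open_with_referrer` are made up for illustration. It assigns `window.location.href` directly rather than going through `document.write`. Whether the browser actually reports the first page as the referrer also depends on that page's referrer policy.

```python
import json


def build_redirect_script(target_url):
    """Return JavaScript that navigates the current page to target_url."""
    # json.dumps yields a correctly quoted and escaped JavaScript string literal.
    return "window.location.href = {0};".format(json.dumps(target_url))


def open_with_referrer(driver, referrer_url, target_url):
    """Load referrer_url, then navigate away from it via injected JavaScript."""
    # Step 1: go to the referral website.
    driver.get(referrer_url)
    # Step 2: inject the redirect so the navigation originates from that page.
    driver.execute_script(build_redirect_script(target_url))
```

With a real driver this would be called as `open_with_referrer(driver, "http://referrer.example/", "http://target.example/page")`, where both URLs are placeholders.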

Here is a solution in Python to do exactly that:

https://github.com/j-bennet/selenium-referer

I described the use case and the solution in the README. I don't expect the GitHub repo to go anywhere, but I'll quote the relevant pieces here just in case.

The solution uses libmproxy to implement a proxy server that does only one thing: it adds a Referer header to every request. The header value is specified as a command-line parameter when running the proxy. Code:

# -*- coding: utf-8 -*-
"""
Proxy server to add a specified Referer: header to the request.
"""
from optparse import OptionParser
from libmproxy import controller, proxy
from libmproxy.proxy.server import ProxyServer


class RefererMaster(controller.Master):
    """
    Adds a specified referer header to the request.
    """

    def __init__(self, server, referer):
        """
        Init the proxy master.
        :param server: ProxyServer
        :param referer: string
        """
        controller.Master.__init__(self, server)
        self.referer = referer

    def run(self):
        """
        Basic run method.
        """
        try:
            print('Running...')
            return controller.Master.run(self)
        except KeyboardInterrupt:
            self.shutdown()

    def handle_request(self, flow):
        """
        Adds a Referer header.
        """
        flow.request.headers['referer'] = [self.referer]
        flow.reply()

    def handle_response(self, flow):
        """
        Does not do anything extra.
        """
        flow.reply()


def start_proxy_server(port, referer):
    """
    Start proxy server and return an instance.
    :param port: int
    :param referer: string
    :return: RefererMaster
    """
    config = proxy.ProxyConfig(port=port)
    server = ProxyServer(config)
    m = RefererMaster(server, referer)
    m.run()


if __name__ == '__main__':

    parser = OptionParser()

    parser.add_option("-r", "--referer", dest="referer",
                      help="Referer URL.")

    parser.add_option("-p", "--port", dest="port", type="int",
                      help="Port number (int) to run the server on.")

    popts, pargs = parser.parse_args()

    start_proxy_server(popts.port, popts.referer)
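
Note that libmproxy is the old module name of mitmproxy, and the `controller.Master` API shown above is not available in current releases as far as I know. With a recent mitmproxy, the same idea could be sketched as an addon script instead. This is only a sketch under that assumption: the `REFERER` constant and the `AddReferer` class name are hypothetical, and the value is hard-coded here rather than taken from the command line.

```python
"""
Sketch of a mitmproxy addon that adds a Referer header to every request.
"""

# Hypothetical value for illustration; replace with the referer you need.
REFERER = "http://referrer.example/"


class AddReferer(object):
    """Sets the Referer header on each request passing through the proxy."""

    def __init__(self, referer):
        self.referer = referer

    def request(self, flow):
        # mitmproxy calls this hook for each client request
        # before the request is forwarded to the server.
        flow.request.headers["Referer"] = self.referer


# mitmproxy picks up addon instances from this module-level list.
addons = [AddReferer(REFERER)]
```

Saved as, say, add_referer.py (the file name is arbitrary), it would be started with something like `mitmdump -s add_referer.py -p 8888`.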

Then, in the test's setUp() method, the proxy server is started as an external process using pexpect, and it is stopped again in tearDown(). A method called proxy() returns the proxy settings used to configure the Firefox driver:

# -*- coding: utf-8 -*-
import os
import sys
import pexpect
import unittest

from selenium.webdriver.common.proxy import Proxy, ProxyType

import utils


class ProxyBase(unittest.TestCase):
    """
    We have to use our own proxy server to set a Referer header, because Selenium does not
    allow to interfere with request headers.

    This is the base class. Change `proxy_referer` to set different referers.
    """

    base_url = 'http://www.facebook.com'

    proxy_server = None

    proxy_address = '127.0.0.1'
    proxy_port = 8888
    proxy_referer = None
    proxy_command = '{0} {1} --referer {2} --port {3}'

    def setUp(self):
        """
        Create the environment.
        """
        print('\nSetting up.')
        self.start_proxy()
        self.driver = utils.create_driver(proxy=self.proxy())

    def tearDown(self):
        """
        Cleanup the environment.
        """
        print('\nTearing down.')
        utils.close_driver(self.driver)
        self.stop_proxy()

    def proxy(self):
        """
        Create proxy settings for our Firefox profile.
        :return: Proxy
        """
        proxy_url = '{0}:{1}'.format(self.proxy_address, self.proxy_port)

        p = Proxy({
            'proxyType': ProxyType.MANUAL,
            'httpProxy': proxy_url,
            'ftpProxy': proxy_url,
            'sslProxy': proxy_url,
            'noProxy': 'localhost, 127.0.0.1'
        })

        return p

    def start_proxy(self):
        """
        Start the proxy process.
        """
        if not self.proxy_referer:
            raise Exception('Set the proxy_referer in child class!')

        python_path = sys.executable
        current_dir = os.path.dirname(__file__)
        proxy_file = os.path.normpath(os.path.join(current_dir, 'referer_proxy.py'))

        command = self.proxy_command.format(
            python_path, proxy_file, self.proxy_referer, self.proxy_port)

        print('Running the proxy command:')
        print(command)

        self.proxy_server = pexpect.spawnu(command)
        self.proxy_server.expect_exact(u'Running...', 2)

    def stop_proxy(self):
        """
        Stop the proxy process.
        """
        print('Stopping proxy server...')
        self.proxy_server.close(True)
        print('Proxy server stopped.')

I wanted my unit tests to start and stop the proxy server without any user interaction and could not find any Python samples doing that, which is why I created the GitHub repo (link above).

Hope this helps someone.

Not sure if I understand your question correctly, but if you want to modify your HTTP requests, there is no way to do it directly with WebDriver. You must route your requests through a proxy. I prefer using BrowserMob Proxy; you can get it through Maven or similar.

ProxyServer server = new ProxyServer(proxy_port); //net.lightbody.bmp.proxy.ProxyServer;
server.start();
server.setCaptureHeaders(true);
Proxy proxy = server.seleniumProxy(); //org.openqa.selenium.Proxy
proxy.setHttpProxy("localhost").setSslProxy("localhost");

server.addRequestInterceptor(new RequestInterceptor() {
     @Override
     public void process(BrowserMobHttpRequest browserMobHttpRequest, Har har) {
        browserMobHttpRequest.addRequestHeader("Referer", "blabla");
     }
});

// configure it as a desired capability
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(CapabilityType.PROXY, proxy);
// start the driver
driver = new FirefoxDriver(capabilities);

Or black/whitelist anything:

server.blacklistRequests("https?://.*\\.google-analytics\\.com/.*", 410);
server.whitelistRequests("https?://.*\\.yoursite\\.com/.*,https://.*\\.someOtherYourSite\\..*".split(","), 200);
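
Since these lists are plain regular expressions, it can help to check what they match before handing them to the proxy. Below is a quick sketch using Python's `re` module. It assumes the proxy matches each pattern against the whole URL (hence `re.fullmatch`), which I have not verified for BrowserMob, and the whitelist patterns are my reading of what the line above intends. The helper names are made up for illustration.

```python
import re

# Patterns written as regular expressions: every literal dot is escaped.
BLACKLIST = r"https?://.*\.google-analytics\.com/.*"
WHITELIST = [
    r"https?://.*\.yoursite\.com/.*",
    r"https://.*\.someOtherYourSite\..*",
]


def is_blacklisted(url):
    """True if the whole URL matches the blacklist pattern."""
    return re.fullmatch(BLACKLIST, url) is not None


def is_whitelisted(url):
    """True if the whole URL matches at least one whitelist pattern."""
    return any(re.fullmatch(pattern, url) for pattern in WHITELIST)
```

Under these assumptions, `is_blacklisted("http://www.google-analytics.com/collect?v=1")` is true, while the bare host `http://google-analytics.com/collect` is not matched, because the pattern requires a dot directly before `google-analytics`.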
Licensed under: CC-BY-SA with attribution