I've been testing Selenium with chromedriver, and I noticed that some pages can detect that you're using Selenium even when there's no automation at all. Even when I'm just browsing manually with Chrome through Selenium and Xephyr, I often get a page saying that suspicious activity was detected. I've checked my user agent and my browser fingerprint, and both are identical to those of the normal Chrome browser.
When I browse to these sites in normal Chrome everything works fine, but the moment I use Selenium I'm detected.
In theory, chromedriver and Chrome should look exactly the same to any web server, but somehow the site can tell them apart.
If you want some test code, try this:

    from pyvirtualdisplay import Display
    from selenium import webdriver

    # Start a visible virtual display for the browser to run in.
    display = Display(visible=1, size=(1600, 902))
    display.start()

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--disable-extensions')
    chrome_options.add_argument('--profile-directory=Default')
    chrome_options.add_argument('--incognito')
    chrome_options.add_argument('--disable-plugins-discovery')
    chrome_options.add_argument('--start-maximized')

    # Newer Selenium releases take the keyword `options` instead of `chrome_options`.
    driver = webdriver.Chrome(options=chrome_options)
    driver.delete_all_cookies()
    driver.set_window_size(800, 800)
    driver.set_window_position(0, 0)
    print('arguments done')
    driver.get('http://stubhub.com')
If you browse around stubhub you'll get redirected and 'blocked' within one or two requests. I've been investigating this and I can't figure out how they can tell that a user is using Selenium.
How do they do it?
I installed the Selenium IDE plugin in Firefox and I got banned when I went to stubhub.com in the normal Firefox browser with only the additional plugin.
When I use Fiddler to view the HTTP requests being sent back and forth I've noticed that the 'fake browser's' requests often have 'no-cache' in the response header.
Results like "Is there a way to detect that I'm in a Selenium Webdriver page from JavaScript" suggest that there should be no way to detect when you are using a webdriver, but this evidence suggests otherwise.
The site uploads a fingerprint to their servers, but I checked and the fingerprint of Selenium is identical to the fingerprint when using Chrome.
This is one of the fingerprint payloads that they send to their servers:
    {"appName":"Netscape","platform":"Linuxx86_64","cookies":1,"syslang":"en-US","userlang":"en-US","cpu":"","productSub":"20030107","setTimeout":1,"setInterval":1,
     "plugins":{"0":"ChromePDFViewer","1":"ShockwaveFlash","2":"WidevineContentDecryptionModule","3":"NativeClient","4":"ChromePDFViewer"},
     "mimeTypes":{"0":"application/pdf","1":"ShockwaveFlashapplication/x-shockwave-flash","2":"FutureSplashPlayerapplication/futuresplash","3":"WidevineContentDecryptionModuleapplication/x-ppapi-widevine-cdm","4":"NativeClientExecutableapplication/x-nacl","5":"PortableNativeClientExecutableapplication/x-pnacl","6":"PortableDocumentFormatapplication/x-google-chrome-pdf"},
     "screen":{"width":1600,"height":900,"colorDepth":24},
     "fonts":{"0":"monospace","1":"DejaVuSerif","2":"Georgia","3":"DejaVuSans","4":"TrebuchetMS","5":"Verdana","6":"AndaleMono","7":"DejaVuSansMono","8":"LiberationMono","9":"NimbusMonoL","10":"CourierNew","11":"Courier"}}
It's identical in Selenium and in Chrome.
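One way to double-check that two captured payloads really match is to compare them key by key rather than by eye. The following is a minimal sketch, assuming both payloads were saved as JSON strings (for example, copied out of Fiddler); `diff_fingerprints` is a hypothetical helper name, not part of any library.

```python
import json


def diff_fingerprints(payload_a, payload_b):
    """Return {key: (value_a, value_b)} for top-level keys whose values differ."""
    a = json.loads(payload_a)
    b = json.loads(payload_b)
    differences = {}
    # Walk the union of keys so a key missing on one side is reported too.
    for key in sorted(set(a) | set(b)):
        if a.get(key) != b.get(key):
            differences[key] = (a.get(key), b.get(key))
    return differences
```

An empty result means the two payloads carry the same top-level values, which would point to the detection relying on something outside this fingerprint.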
VPNs work for a single use, but they get detected after I load the first page. Clearly some JavaScript is being run to detect Selenium.
The answer is YES! Websites can detect automation through navigator.webdriver, an experimental JavaScript property on the navigator interface. If the website is loaded with automation tools like Selenium, the value of navigator.
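To see what page scripts would observe, you can read that property from inside the driven session. This is only a sketch, assuming `driver` is a Selenium WebDriver instance; `read_webdriver_flag` is a hypothetical helper name, while `execute_script` is the standard Selenium method for running JavaScript in the current page.

```python
def read_webdriver_flag(driver):
    """Return navigator.webdriver as seen by scripts on the current page."""
    # execute_script runs the snippet in the page and returns its result.
    return driver.execute_script("return navigator.webdriver")
```

Calling it right after `driver.get(...)` and comparing the result with what a normally launched browser shows in its console gives a direct before/after comparison.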
Basically, the way the Selenium detection works is that they test for predefined JavaScript variables which appear when running with Selenium. The bot detection scripts usually look for anything containing the word "selenium" or "webdriver" in any of the variables (on the window object), and also for document variables called $cdc_ and $wdc_. Of course, all of this depends on which browser you are on. All the different browsers expose different things.
For me, I used Chrome, so all I had to do was ensure that $cdc_ no longer existed as a document variable, and voilà (download the chromedriver source code, change $cdc_ to a different name, and recompile chromedriver).
This is the function I modified in chromedriver:
    function getPageCache(opt_doc) {
      var doc = opt_doc || document;
      //var key = '$cdc_asdjflasutopfhvcZLmcfl_';
      var key = 'randomblabla_';
      if (!(key in doc))
        doc[key] = new Cache();
      return doc[key];
    }
(Note the comment. All I did was change $cdc_ to randomblabla_.)
Here is pseudocode which demonstrates some of the techniques that bot-detection scripts might use:
    runBotDetection = function () {
        var documentDetectionKeys = [
            "__webdriver_evaluate",
            "__selenium_evaluate",
            "__webdriver_script_function",
            "__webdriver_script_func",
            "__webdriver_script_fn",
            "__fxdriver_evaluate",
            "__driver_unwrapped",
            "__webdriver_unwrapped",
            "__driver_evaluate",
            "__selenium_unwrapped",
            "__fxdriver_unwrapped",
        ];

        var windowDetectionKeys = [
            "_phantom",
            "__nightmare",
            "_selenium",
            "callPhantom",
            "callSelenium",
            "_Selenium_IDE_Recorder",
        ];

        for (const windowDetectionKey in windowDetectionKeys) {
            const windowDetectionKeyValue = windowDetectionKeys[windowDetectionKey];
            if (window[windowDetectionKeyValue]) {
                return true;
            }
        };
        for (const documentDetectionKey in documentDetectionKeys) {
            const documentDetectionKeyValue = documentDetectionKeys[documentDetectionKey];
            if (window['document'][documentDetectionKeyValue]) {
                return true;
            }
        };

        for (const documentKey in window['document']) {
            if (documentKey.match(/\$[a-z]dc_/) && window['document'][documentKey]['cache_']) {
                return true;
            }
        }

        if (window['external'] && window['external'].toString() && (window['external'].toString()['indexOf']('Sequentum') != -1)) return true;

        if (window['document']['documentElement']['getAttribute']('selenium')) return true;
        if (window['document']['documentElement']['getAttribute']('webdriver')) return true;
        if (window['document']['documentElement']['getAttribute']('driver')) return true;

        return false;
    };
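If you dump the property names of `window` and `document` from a driven session, you can run the name-based part of these checks offline. The sketch below is a rough Python approximation, not the detection code any site actually ships: `suspicious_keys` is a hypothetical helper, and it only looks at names (it does not reproduce the `cache_` or attribute checks from the pseudocode).

```python
import re

# Substrings and pattern taken from the checks described in this answer.
SUSPICIOUS_SUBSTRINGS = ("selenium", "webdriver")
DRIVER_KEY_PATTERN = re.compile(r"\$[a-z]dc_")


def suspicious_keys(property_names):
    """Return the names that a name-based bot check would likely flag."""
    flagged = []
    for name in property_names:
        lowered = name.lower()
        has_substring = any(s in lowered for s in SUSPICIOUS_SUBSTRINGS)
        if has_substring or DRIVER_KEY_PATTERN.search(name):
            flagged.append(name)
    return flagged
```

Running it over the names from both a normal and a Selenium-driven browser makes it easy to spot which variables only appear under automation.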
According to user szx, it is also possible to simply open chromedriver.exe in a hex editor, and just do the replacement manually, without actually doing any compiling.
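The same replacement can be scripted instead of done by hand in a hex editor. This is a minimal sketch, assuming the marker string appears literally in the binary (it may differ between chromedriver versions); `rename_marker` is a hypothetical helper. It insists on an equal-length replacement so that nothing else in the executable shifts position.

```python
def rename_marker(binary, old, new):
    """Replace every occurrence of `old` with `new` in a bytes object."""
    # A different length would move everything after the marker,
    # which is likely to corrupt a compiled executable.
    if len(old) != len(new):
        raise ValueError("replacement must be the same length as the original")
    return binary.replace(old, new)
```

In practice you would read the chromedriver file in binary mode, pass its contents through this function, and write the result to a copy rather than overwriting the original.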