
How does Cloudflare differentiate Selenium and Requests traffic?

Context

I am trying to build a small-scale bot using the Selenium and Requests modules in Python.
However, the webpage I want to interact with sits behind Cloudflare.
My Python script runs over Tor using the stem module.
My traffic analysis is based on the Network tab of Firefox's Developer Tools, with Persist Logs enabled.

My findings so far:

  • Selenium's Firefox webdriver can often access the webpage without hitting either the "checking browser" page (status code 503) or the "captcha" page (status code 403).
  • A Requests session object with the same user agent always gets the "captcha" page (status code 403).

If Cloudflare were checking whether my client can run JavaScript, shouldn't the Requests session receive a 503 rather than a 403?

Code Example

import requests
from selenium import webdriver

# fp and fOptions are defined earlier in the script (not shown here)
driver = webdriver.Firefox(firefox_profile=fp, options=fOptions)
driver.get("https://www.cloudflare.com")   # usually loads normally (200) without the browser check

session = requests.Session()
# ... applied socks5 proxy for both http and https ... #
# reuse the exact user agent string reported by the Selenium-driven Firefox
session.headers.update({"user-agent": driver.execute_script("return navigator.userAgent;")})
page = session.get("https://www.cloudflare.com")
print(page.status_code) # prints 403
print(page.text)        # prints the "captcha page" HTML

Both Selenium and Requests use the same user agent and IP address.
Both send a plain GET without any parameters.
How does Cloudflare tell these two kinds of traffic apart?
Am I missing something?


I tried transferring the cookies from the webdriver to the Requests session to see whether that would get past the check, but had no luck.
Here is the code I used:

for c in driver.get_cookies():
    session.cookies.set(c['name'], c['value'], domain=c['domain'])
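
The loop above copies only the name, value, and domain of each cookie. As a minimal sketch, assuming the Selenium cookie dicts carry the usual `path`, `secure`, `httpOnly`, and `expiry` keys, the remaining attributes could be preserved by building standard-library `http.cookiejar.Cookie` objects. The helper name and the sample cookie are hypothetical. Since the Requests cookie jar is a subclass of `http.cookiejar.CookieJar`, the resulting objects should also be accepted by `session.cookies.set_cookie(...)`. This only makes the transfer more faithful; it does not by itself explain or remove the 403.

```python
from http.cookiejar import Cookie, CookieJar


def selenium_cookie_to_cookie(c: dict) -> Cookie:
    """Convert one Selenium-style cookie dict into an http.cookiejar.Cookie.

    Hypothetical helper: keeps path, secure flag, and expiry in addition to
    the name, value, and domain used in the original loop.
    """
    domain = c.get("domain", "")
    expiry = c.get("expiry")  # absent for session cookies
    return Cookie(
        version=0,
        name=c["name"],
        value=c["value"],
        port=None,
        port_specified=False,
        domain=domain,
        domain_specified=bool(domain),
        domain_initial_dot=domain.startswith("."),
        path=c.get("path", "/"),
        path_specified=True,
        secure=bool(c.get("secure", False)),
        expires=expiry,
        discard=expiry is None,
        comment=None,
        comment_url=None,
        rest={"HttpOnly": None} if c.get("httpOnly") else {},
    )


# Made-up cookie in the shape returned by driver.get_cookies()
sample = {
    "name": "example_cookie",
    "value": "abc123",
    "domain": ".example.com",
    "path": "/",
    "secure": True,
    "httpOnly": True,
    "expiry": 1893456000,
}

jar = CookieJar()
jar.set_cookie(selenium_cookie_to_cookie(sample))
for cookie in jar:
    print(cookie.name, cookie.domain, cookie.path, cookie.secure, cookie.expires)
```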
asked Apr 10 '26 07:04 by ku8zi


2 Answers

There are additional JavaScript APIs exposed to the webpage when using Selenium. If you can disable them, you may be able to fix the problem.
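
One such API is the standard `navigator.webdriver` property, which the WebDriver specification sets to true in an automated browser. A minimal sketch of checking it from Python, assuming a `driver` object like the one in the question; the helper name is hypothetical, and this is not necessarily the only signal a site could look at:

```python
def is_flagged_as_automated(driver) -> bool:
    """Return True if page scripts can see navigator.webdriver === true.

    `driver` is any object with Selenium's execute_script(script) method,
    for example the webdriver.Firefox instance from the question.
    """
    return bool(driver.execute_script("return navigator.webdriver === true;"))


# Usage, with the driver from the question:
# print(is_flagged_as_automated(driver))
```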

answered Apr 12 '26 19:04 by 9pfs supports Ukraine


Cloudflare doesn't only check HTTP headers or JavaScript; it also analyses the TLS handshake. I'm not sure exactly how it does it, but I've found that it can be circumvented by using NSS (the TLS library Firefox uses) instead of OpenSSL, though NSS is not well integrated into Requests.
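
To illustrate why the TLS layer can give a client away even when the user agent matches, here is a rough sketch using only Python's standard `ssl` module. It hashes the ordered list of cipher suites that a context would offer. This is a simplified stand-in, not a real fingerprinting scheme such as JA3 and not Cloudflare's actual method, which the answer does not claim to know; the function name is hypothetical. The point is only that the offered cipher list is a property of the TLS library and its configuration, so an OpenSSL-based Python client and an NSS-based Firefox would be expected to look different on the wire.

```python
import hashlib
import ssl


def cipher_fingerprint(context: ssl.SSLContext) -> str:
    """Hash the ordered cipher-suite IDs a context would offer.

    Illustrative only: real TLS fingerprints also cover extensions,
    supported groups, and other ClientHello fields.
    """
    ids = [str(c["id"]) for c in context.get_ciphers()]
    return hashlib.sha256("-".join(ids).encode("ascii")).hexdigest()


# The default context, similar to what an OpenSSL-backed HTTP client starts from
default_ctx = ssl.create_default_context()

# The same library with a narrower cipher selection
restricted_ctx = ssl.create_default_context()
restricted_ctx.set_ciphers("ECDHE+AESGCM")

print(len(default_ctx.get_ciphers()), cipher_fingerprint(default_ctx))
print(len(restricted_ctx.get_ciphers()), cipher_fingerprint(restricted_ctx))
```

Changing only the `user-agent` header leaves this layer untouched, which is consistent with the question's observation that Requests is still treated differently.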

answered Apr 12 '26 20:04 by Kyuuhachi


