
I'm not able to use python requests session cookies in selenium

I'm trying to open a requests session in a web browser, and by the looks of it, using Selenium is the most efficient way to do that.

My code:

import requests
from selenium import webdriver
from time import sleep

s = requests.Session()
s.get('https://www.sotf.com/en/nike/man/footwear/nike--joyride--cc3--setter--sneakers--at6395.html?RwDet=true&articoli_ID=17911')

driver = webdriver.Safari()

driver.get("https://www.sotf.com/")

for cookie in s.cookies:
    driver.add_cookie({
        'name': cookie.name, 
        'value': cookie.value,
        'path': '/',
        'domain': cookie.domain,
    })

driver.refresh()
sleep(1000)

When printing s.cookies.get_dict(), I get the following cookies:

{'__cfduid': 'dc81dd94c218523ce8161e4254d2652a01566815239', 'PHPSESSID': 'qhm7109shdrhu9uv3t38ani9df'}

The problem is that the browser isn't using these cookies. When I check the cookies in Safari (using Inspect Element), __cfduid looks just as it should, but for an unknown reason I see two PHPSESSID cookies, and the correct one has the Domain attribute set to .www.sotf.com instead of www.sotf.com.

Many thanks in advance.

Nazim Kerimbekov asked Aug 26 '19 10:08



1 Answer

The PHPSESSID cookie is stored twice because the page is opened twice: the first time the browser opens it with an empty cookie jar, so the server sets a first, insecure PHPSESSID cookie, and then you copy a second one from the requests.Session. Clear the cookies once you land on the host. In the example below, I navigate to https://www.sotf.com/404 (404 pages are usually faster to load), clear the default cookies, then copy the cookies from the requests cookie jar:

import contextlib
import requests
from selenium import webdriver


@contextlib.contextmanager
def init_driver():
    d = webdriver.Chrome()
    try:
        yield d
    finally:
        # quit the browser even if the body of the with block raises
        d.quit()


if __name__ == '__main__':
    headers = {
        'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
        'accept-encoding': 'gzip, deflate, br',
        'accept-language': 'en-US,en;q=0.9,de;q=0.8',
        'sec-fetch-mode': 'navigate',
        'sec-fetch-site': 'none',
        'upgrade-insecure-requests': '1',
        'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.132 Safari/537.36',
    }

    params = {
        'RwDet': 'true',
        'articoli_ID': '17911',
    }

    s = requests.Session()
    s.get('https://www.sotf.com/en/nike/man/footwear/nike--joyride--cc3--setter--sneakers--at6395.html', headers=headers, params=params)
    print('cookies in requests jar:')
    for c in s.cookies:
        print(c)


    with init_driver() as driver:
        # 404 pages are usually faster to load
        driver.get("https://www.sotf.com/404")
        driver.delete_all_cookies()

        for cookie in s.cookies:
            driver.add_cookie({
                'name': cookie.name,
                'value': cookie.value,
                'path': '/',
                'domain': cookie.domain,
            })

        driver.get("https://www.sotf.com/")
        print('cookies in selenium jar:')
        for c in driver.get_cookies():
            print(c)

Output:

cookies in requests jar:
<Cookie __cfduid=d54b8f9098af12dee16136e4dc641f74e1567012133 for .sotf.com/>
<Cookie PHPSESSID=mn28k5ta3ghfc77qb4nl23tga6 for www.sotf.com/>
cookies in selenium jar:
{'domain': 'www.sotf.com', 'expiry': 1598548157, 'httpOnly': False, 'name': 'cb-enabled', 'path': '/', 'secure': False, 'value': 'enabled'}
{'domain': 'www.sotf.com', 'httpOnly': False, 'name': 'PHPSESSID', 'path': '/', 'secure': True, 'value': 'mn28k5ta3ghfc77qb4nl23tga6'}
{'domain': 'sotf.com', 'httpOnly': False, 'name': '__cfduid', 'path': '/', 'secure': True, 'value': 'd54b8f9098af12dee16136e4dc641f74e1567012133'}
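
If you also want to keep the secure flag and expiry, and avoid the extra leading dot on host-only cookies like PHPSESSID, the conversion could be factored into a helper roughly like the one below. This is only a sketch under some assumptions: to_selenium_cookies and make_cookie are names I made up, the cookie values and the expiry are placeholders, and I'm relying on the driver binding a cookie to the current host when no 'domain' key is passed, which I have not checked in Safari. The jar from requests.Session() yields standard http.cookiejar.Cookie objects, so the demo builds those by hand instead of hitting the site.

```python
from http.cookiejar import Cookie, CookieJar


def to_selenium_cookies(jar):
    """Turn http.cookiejar.Cookie objects (e.g. from s.cookies) into
    dicts shaped for driver.add_cookie()."""
    converted = []
    for cookie in jar:
        entry = {
            'name': cookie.name,
            'value': cookie.value,
            'path': cookie.path or '/',
            'secure': bool(cookie.secure),
        }
        # Pass a domain only when the server sent an explicit Domain
        # attribute. Assumption: without the key, the driver binds the
        # cookie to the host of the page that is currently loaded.
        if cookie.domain_specified:
            entry['domain'] = cookie.domain
        # Session cookies have no expiry, so leave the key out for them.
        if cookie.expires is not None:
            entry['expiry'] = int(cookie.expires)
        converted.append(entry)
    return converted


def make_cookie(name, value, domain, domain_specified, secure, expires):
    """Hypothetical helper that builds a Cookie by hand for the demo."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=domain_specified,
        domain_initial_dot=domain.startswith('.'),
        path='/', path_specified=True,
        secure=secure, expires=expires, discard=expires is None,
        comment=None, comment_url=None, rest={},
    )


# Placeholder cookies mirroring the two in the requests jar above.
demo_jar = CookieJar()
demo_jar.set_cookie(make_cookie('__cfduid', 'placeholder-cfduid', '.sotf.com',
                                True, True, 1598548157))
demo_jar.set_cookie(make_cookie('PHPSESSID', 'placeholder-session', 'www.sotf.com',
                                False, True, None))

selenium_cookies = to_selenium_cookies(demo_jar)
for c in selenium_cookies:
    print(c)
```

In the script above you would then replace the inner loop with something like `for c in to_selenium_cookies(s.cookies): driver.add_cookie(c)`, still after driver.delete_all_cookies().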
hoefling answered Nov 01 '22 16:11
