
Selenium and BeautifulSoup: sharing session data between multiple libraries in Python

I have a problem combining two libraries in Python 3.6. I use the Selenium Firefox WebDriver to log into a website, but when I then ask Requests (with BeautifulSoup) to read a page on that site, it fetches the URL but gets the page as if I had not logged in. How can I tell Requests that I have already logged in?

Below is the code I have written so far:

from selenium import webdriver
import config
import requests
from bs4 import BeautifulSoup

#choose webdriver
browser=webdriver.Firefox(executable_path="C:\\Users\\myUser\\geckodriver.exe")
browser.get("https://www.mylink.com/")

#log in
timeout = 1
login = browser.find_element_by_name("sf-login")
login.send_keys(config.USERNAME)

password = browser.find_element_by_name("sf-password")
password.send_keys(config.PASSWORD)

button_log = browser.find_element_by_xpath("/html/body/div[2]/div[1]/div/section/div/div[2]/form/p[2]/input")
button_log.click()

name = "https://www.policytracker.com/auctions/page/"
browser.get(name)

# N is the 1-based index of the entry to open (not defined in this snippet)
name2 = "/html/body/div[2]/div[1]/div/section/div/div[2]/div[3]/div[" + str(N) + "]/a"

#next page loaded
title1 = browser.find_element_by_xpath(name2)
title1.click()
# save the URL of the page whose content I want to download (I am already logged in on this page)
page = browser.current_url
# I want requests to fetch this page; it does, but without the logged-in session (WRONG)
r = requests.get(page)
soup = BeautifulSoup(r.content, 'lxml')
print(soup)

Rafał Szumski asked Sep 11 '25 13:09


1 Answer

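If you simply want to pass the page source to BeautifulSoup, you can get it from Selenium and hand it to BeautifulSoup directly; there is no need for the requests module. Selenium's page_source reflects the page as the logged-in browser sees it, whereas requests.get() opens a separate connection that carries none of the browser's cookies.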

Instead of

page = browser.current_url
r = requests.get(page)
soup = BeautifulSoup(r.content, 'lxml')

you can do

page = browser.page_source
soup = BeautifulSoup(page, 'html.parser')
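
If Requests is still needed (for example, to fetch many pages without driving the browser), one possible approach is to copy the browser's cookies into a requests.Session. The following is a minimal sketch, not part of the original code: the helper name session_from_selenium_cookies is hypothetical, and it assumes the site keeps its login state in cookies, which may not hold for every site.

```python
import requests


def session_from_selenium_cookies(selenium_cookies, user_agent=None):
    """Build a requests.Session that carries cookies taken from a browser.

    selenium_cookies: list of dicts, as returned by driver.get_cookies()
    user_agent: optional User-Agent string, to mirror the browser's header
    (hypothetical helper; assumes the login state lives in cookies)
    """
    session = requests.Session()
    for cookie in selenium_cookies:
        session.cookies.set(
            cookie["name"],
            cookie["value"],
            domain=cookie.get("domain", ""),
            path=cookie.get("path", "/"),
        )
    if user_agent:
        session.headers["User-Agent"] = user_agent
    return session


# Usage sketch with the browser object from the question, after logging in:
# session = session_from_selenium_cookies(
#     browser.get_cookies(),
#     browser.execute_script("return navigator.userAgent"),
# )
# r = session.get(browser.current_url)
# soup = BeautifulSoup(r.content, 'html.parser')
```

Sending the same User-Agent is optional; it is included here because some sites may tie a session to the client that created it. If the site stores its login token somewhere other than cookies, this sketch will not be enough and page_source remains the simpler route.
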

Keyur Potdar answered Sep 13 '25 01:09