Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Selenium driver's page source different than browser

i unfortunately am not able to post code to reproduce this problem, since it involves signing into a site that is not a public site. but my question is more general than code problems. essentially, driver.page_source does not match what shows up in the browser it is driving. this is not an issue with elements not loading fully because i am testing this while executing code line by line in a python terminal. i am looking at the page source in the browser after right clicking and going to "view page source", and but if i print driver.page_source or attempt to find_element_by_[...], it shows slightly different code with entire elements missing. here is the html in question:

<nav role="navigation" class="utility-nav__wrapper--right">
<input id="hdn_partyId" value="1965629" type="hidden">
<input id="hdn_firstName" value="CHARLES" type="hidden">
<input id="hdn_sessionId" value="uHxQhlARvzA7N16uh+KJAdNFIcY6D8f9ornqoPQ" type="hidden">
<input id="hdn_cmsAlertRequest" type="hidden" value="Biennial Plus">
<ul class="h-list h-list--middle">
    [...]
</ul>

i need all 4 of the input elements, however, hdn_partyId and hdn_sessionId elements do not appear in selenium's .page_source and if i try to get them with .find_element_by_[...] i get a NoSuchElementException

i even ran a check on finding all input elements and listing them, and these 2 do not show up.

does anyone have any idea why selenium would not provide the same content as directly looking at the browser it is driving?

EDIT: to clarify... i am driving Chrome with Chromedriver through Selenium. this is not an issue with the page not fully loading. as i mentioned, i am running this manually line by line through a python terminal and not executing a script. so the browser pops up, loads the page, logs in, and then i manually check the browser's page source and see the element, then i print driver.page_source and it's not there, and if i run session_id = driver.find_element_by_id('hdn_sessionId') i get a NoSuchElementException. there are also no frames at all in the page, nor any additional windows.

like image 420
crookedleaf Avatar asked Jul 21 '17 20:07

crookedleaf


2 Answers

A coworker of mine has figured out the issue and a workaround. Essentially, after the page is done loading, it runs a javascript command that cleans up the DOM. What the "view page source" in the browser shows is not what the current state is. So running print driver.page_source or using any form of driver.find_element_by_[...] is pulling from the newest and freshest page data, while the browser's "view page source" only shows what was provided when the page first loaded. If you start 'inspecting' the page in Chrome, you will see the HTML is different than what the browser says the "page source" is. After reverse engineering the Javascript, we are able to run partyid = driver.execute_script('return accountdata.$partyId.val();') and get what was originally assigned. I hope this is enough info to help other people who may run into this issue in the future.

like image 156
crookedleaf Avatar answered Oct 20 '22 11:10

crookedleaf


try like this you will get source code keyword "view-source:" which can be different according to your browser this is for the chrome

driver.get("view-source:"+url)

sourcecode=driver.find_element_by_tag_name('body').text
like image 4
yash shah Avatar answered Oct 20 '22 10:10

yash shah