i unfortunately am not able to post code to reproduce this problem, since it involves signing into a site that is not public. but my question is more general than a code problem. essentially, driver.page_source does not match what shows up in the browser it is driving. this is not an issue with elements not loading fully, because i am testing this while executing code line by line in a python terminal. i am looking at the page source in the browser after right clicking and going to "view page source", but if i print driver.page_source or attempt to find_element_by_[...], it shows slightly different code with entire elements missing. here is the html in question:
<nav role="navigation" class="utility-nav__wrapper--right">
<input id="hdn_partyId" value="1965629" type="hidden">
<input id="hdn_firstName" value="CHARLES" type="hidden">
<input id="hdn_sessionId" value="uHxQhlARvzA7N16uh+KJAdNFIcY6D8f9ornqoPQ" type="hidden">
<input id="hdn_cmsAlertRequest" type="hidden" value="Biennial Plus">
<ul class="h-list h-list--middle">
[...]
</ul>
i need all 4 of the input elements, however, the hdn_partyId and hdn_sessionId elements do not appear in selenium's .page_source, and if i try to get them with .find_element_by_[...] i get a NoSuchElementException. i even ran a check on finding all input elements and listing them, and these 2 do not show up.
does anyone have any idea why selenium would not provide the same content as directly looking at the browser it is driving?
EDIT: to clarify... i am driving Chrome with Chromedriver through Selenium. this is not an issue with the page not fully loading. as i mentioned, i am running this manually line by line through a python terminal and not executing a script. so the browser pops up, loads the page, logs in, and then i manually check the browser's page source and see the element, then i print driver.page_source and it's not there, and if i run session_id = driver.find_element_by_id('hdn_sessionId') i get a NoSuchElementException. there are also no frames at all in the page, nor any additional windows.
A coworker of mine has figured out the issue and a workaround. Essentially, after the page is done loading, it runs a JavaScript command that cleans up the DOM. What "view page source" in the browser shows is not the current state of the page. Running print driver.page_source or using any form of driver.find_element_by_[...] pulls from the newest and freshest page data, while the browser's "view page source" only shows what was provided when the page first loaded. If you start 'inspecting' the page in Chrome, you will see that the HTML is different from what the browser says the "page source" is. After reverse engineering the JavaScript, we are able to run partyid = driver.execute_script('return accountdata.$partyId.val();') and get the value that was originally assigned. I hope this is enough info to help other people who may run into this issue in the future.
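To see which elements a cleanup script like this has removed, one option is to diff the ids in the original response against the ids in the live DOM. The following is only a rough sketch using the standard library: the helper names (InputIdCollector, input_ids, removed_input_ids) are made up for illustration, and the two HTML strings are stand-ins based on the snippet in the question. In a real session, live_html would be driver.page_source and original_html would be whatever "view page source" shows.

```python
from html.parser import HTMLParser


class InputIdCollector(HTMLParser):
    """Collect the id attribute of every <input> tag in an HTML string."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            element_id = dict(attrs).get("id")
            if element_id:
                self.ids.add(element_id)


def input_ids(html):
    """Return the set of <input> ids found in html."""
    collector = InputIdCollector()
    collector.feed(html)
    collector.close()
    return collector.ids


def removed_input_ids(original_html, live_html):
    """Return ids present in the original response but gone from the live DOM."""
    return input_ids(original_html) - input_ids(live_html)


# Stand-in data (assumption): with a real session, live_html would be
# driver.page_source and original_html the text from "view page source".
original_html = """
<input id="hdn_partyId" value="1965629" type="hidden">
<input id="hdn_firstName" value="CHARLES" type="hidden">
<input id="hdn_sessionId" value="..." type="hidden">
<input id="hdn_cmsAlertRequest" type="hidden" value="Biennial Plus">
"""
live_html = """
<input id="hdn_firstName" value="CHARLES" type="hidden">
<input id="hdn_cmsAlertRequest" type="hidden" value="Biennial Plus">
"""

print(sorted(removed_input_ids(original_html, live_html)))
# prints ['hdn_partyId', 'hdn_sessionId']
```

If the set comes back non-empty, the page has most likely modified its own DOM after loading, and the missing values have to be read some other way, for example through execute_script as described above.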
Try it like this to get the original source code. The "view-source:" keyword can differ between browsers; this one is for Chrome.
driver.get("view-source:"+url)
sourcecode=driver.find_element_by_tag_name('body').text
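Newer Selenium releases no longer provide the find_element_by_* helpers, so the same idea may need to be written with find_element(by, value). Below is a small sketch of that, wrapped in a hypothetical helper named original_source. It assumes driver is a Chrome WebDriver, and it passes the locator strategy as the plain string "tag name" (the value behind By.TAG_NAME) so that the snippet has no imports of its own. What the body text of a view-source page contains exactly may vary by Chrome version, so treat the result as something to inspect rather than rely on.

```python
def original_source(driver, url):
    """Return the text of the browser's view-source page for url.

    Assumptions: driver is a Selenium WebDriver for Chrome, and the
    "view-source:" prefix is accepted by driver.get().
    """
    # Load the view-source page instead of the rendered page.
    driver.get("view-source:" + url)
    # "tag name" is the string value of By.TAG_NAME in Selenium.
    return driver.find_element("tag name", "body").text
```

Usage would look like sourcecode = original_source(driver, url), after which the string can be searched for the hidden input values.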