How to get the JavaScript-rendered HTML source using Selenium

Tags: python, javascript, selenium

I run a query on a web page and get a result URL. If I right-click and view the HTML source, I can see the HTML generated by JS. If I simply use urllib, Python cannot get the JS-generated code. I have seen some solutions that use Selenium. Here's my code:

from selenium import webdriver
url = 'http://www.archives.com/member/Default.aspx?_act=VitalSearchResult&lastName=Smith&state=UT&country=US&deathYear=2004&deathYearSpan=10&location=UT&activityID=9b79d578-b2a7-4665-9021-b104999cf031&RecordType=2'
# Raw string so the backslashes in the Windows path are kept literally
driver = webdriver.PhantomJS(executable_path=r'C:\python27\scripts\phantomjs.exe')
driver.get(url)
print(driver.page_source)

Output:

<html><head></head><body></body></html>

Obviously that's not right!

Here's the source I need, as shown when I right-click and view the page source (I want the INFORMATION part):

</script></div><div class="searchColRight"><div id="topActions" class="clearfix 
noPrint"><div id="breadcrumbs" class="left"><a title="Results Summary"
href="Default.aspx?    _act=VitalSearchR ...... <<INFORMATION I NEED>> ... 
to view the entire record.</p></div><script xmlns:msxsl="urn:schemas-microsoft-com:xslt">

        jQuery(document).ready(function() {
            jQuery(".ancestry-information-tooltip").actooltip({
href: "#AncestryInformationTooltip", orientation: "bottomleft"});
        });

So my question is: how do I get the information generated by JS?

771

asked Mar 30 '14 02:03

MacSanhe

2 Answers

You will need to get the document via JavaScript. You can use Selenium's execute_script function:

from time import sleep  # this should go at the top of the file

sleep(5)  # give the page's JavaScript time to finish rendering
html = driver.execute_script("return document.getElementsByTagName('html')[0].innerHTML")
print(html)

That will get everything inside the <html> tag.
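
A fixed sleep can be too short or needlessly long. As a rough sketch (not part of the original answer), the same execute_script call can be wrapped in a polling loop that returns as soon as the body has content. The helper name get_rendered_html, its timeout and poll_interval parameters, and the "body is non-empty" readiness check are all assumptions made for illustration; a real page may need a more specific check.

```python
import time


def get_rendered_html(driver, timeout=10.0, poll_interval=0.5):
    """Poll until <body> has content, then return the markup inside <html>.

    `driver` is assumed to expose Selenium's execute_script(script) method.
    The helper name and the readiness check are hypothetical.
    """
    html_script = "return document.getElementsByTagName('html')[0].innerHTML"
    # Length of the body's markup; 0 while the page is still empty.
    body_script = "return document.body ? document.body.innerHTML.length : 0"
    deadline = time.time() + timeout
    while True:
        if driver.execute_script(body_script) > 0:
            return driver.execute_script(html_script)
        if time.time() >= deadline:
            raise RuntimeError("page body still empty after %s seconds" % timeout)
        time.sleep(poll_interval)
```

After driver.get(url), this would be called as html = get_rendered_html(driver) in place of the sleep(5) and execute_script lines.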

158

answered Oct 24 '22 18:10

Victory

That workaround isn't necessary; you can use this instead:

from selenium import webdriver

driver = webdriver.PhantomJS()
driver.get('http://www.google.com/')
html = driver.find_element_by_tag_name('html').get_attribute('innerHTML')
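
Once the rendered markup is in html, the part the question cares about still has to be pulled out of it. Here is a minimal sketch using only the standard library's html.parser, assuming the wanted text sits inside the element with class searchColRight that appears in the question's snippet. The names ClassTextExtractor and extract_text_by_class are hypothetical, and the sample markup in the usage note is invented for illustration.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect text inside the first element carrying a given CSS class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0          # > 0 while inside the target element
        self.in_script = False  # True inside <script>/<style> in the target
        self.done = False       # True once the target element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS or self.done:
            return
        if self.depth:
            self.depth += 1
            if tag in ("script", "style"):
                self.in_script = True
        elif self.class_name in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        if tag in ("script", "style"):
            self.in_script = False
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and not self.in_script and data.strip():
            self.chunks.append(data.strip())


def extract_text_by_class(html, class_name):
    """Return the text fragments found inside the first matching element."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.chunks
```

For example, extract_text_by_class('<div class="searchColRight"><p>John Smith</p></div>', 'searchColRight') returns a list with the single string 'John Smith'. A dedicated HTML library may cope better with badly formed markup; this sketch only avoids extra dependencies.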

answered Oct 24 '22 18:10

Darius
