I'm doing web crawling with Selenium, and I want to get an element (such as a link) that is written into the page by JavaScript after Selenium simulates clicking a fake link.
I tried get_html_source(), but it doesn't include the content written by JavaScript.
Here is the code I've written:
def test_comment_url_fetch(self):
    sel = self.selenium
    sel.open("/rmrb")
    url = sel.get_location()
    #print url
    # If we were bounced to the login page, retry opening the target page.
    if url.startswith('http://login'):
        sel.open("/rmrb")
    i = 1
    while True:
        try:
            # Click the i-th "expand" link; the XPath differs for the first item.
            if i == 1:
                sel.click("//div[@class='WB_feed_type SW_fun S_line2']/div/div/div[3]/div/a[4]")
                print "click"
            else:
                xpath = "//div[@class='WB_feed_type SW_fun S_line2'][%d]/div/div/div[3]/div/a[4]" % i
                sel.click(xpath)
                print "click"
        except Exception, e:
            # No more links to click: stop.
            print e
            break
        i += 1
    # Dump the page source after all the clicks.
    html = sel.get_html_source()
    html_file = open("tmp\\foo.html", 'w')
    html_file.write(html.encode('utf-8'))
    html_file.close()
I use a while loop to click a series of fake links that trigger JavaScript actions to reveal extra content, and that content is what I want. But sel.get_html_source() doesn't include it.
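I suspect the JavaScript content may simply not have finished loading when I grab the source. If so, Selenium RC's wait_for_condition might be the missing piece; here is a minimal sketch (the link-count condition and the 30-second timeout are just placeholders, not values from my actual page):

# Poll until the app window contains at least one link, then grab the source.
# wait_for_condition runs in the test runner's window, so the application
# window has to be reached via selenium.browserbot.getCurrentWindow().
sel.wait_for_condition(
    "selenium.browserbot.getCurrentWindow()"
    ".document.getElementsByTagName('a').length > 0",
    "30000")
html = sel.get_html_source()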
Can anybody help? Thanks a lot.
Since I usually do post-processing on the fetched nodes, I run JavaScript directly in the browser with execute_script. For example, to get all a-tags:

js_code = "return document.getElementsByTagName('a')"
your_elements = sel.execute_script(js_code)
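From there you can post-process the returned nodes in Python. WebDriver hands DOM elements back as WebElement objects (recent versions convert the collection into a list of them), so for example, to print every href:

# Each entry is a WebElement; get_attribute reads the resolved href.
for a in your_elements:
    print a.get_attribute("href")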
Edit: execute_script and get_eval are equivalent, except that get_eval performs an implicit return; in execute_script it has to be stated explicitly.
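To make the difference concrete, here are the two calls side by side (a sketch; counting the page's a-tags is just an illustrative expression, and driver stands in for a WebDriver instance):

# Selenium RC: the last expression is returned implicitly, as a string.
count = sel.get_eval("window.document.getElementsByTagName('a').length")

# WebDriver: the return has to be written out explicitly.
count = driver.execute_script("return document.getElementsByTagName('a').length")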