I am trying to write a Python-based Web Bot that can read and interpret an HTML page, then execute an onClick function and receive the resulting new HTML page. I can already read the HTML page and I can determine the functions to be called by the onClick command, but I have no idea how to execute those functions or how to receive the resulting HTML code.
Any ideas?
The only Python tool for JavaScript that I am aware of is python-spidermonkey. I have never used it, though.
With Jython you could (ab-)use HttpUnit.
Edit: I forgot that you can also use Scrapy. It supports JavaScript through SpiderMonkey, and you can even use Firefox for crawling the web.
Edit 2: Recently, I find myself using browser automation more and more for such tasks, thanks to some excellent libraries. QtWebKit offers full access to a WebKit browser and can be used from Python through language bindings (PySide or PyQt). There seem to be similar libraries and bindings for Gtk+, which I haven't tried. The Selenium WebDriver API also works well and has an active community.
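For the Selenium route, here is a rough sketch of what "click the element, then read the new HTML" might look like. It assumes Selenium 4 with Firefox and geckodriver installed; the helper names `click_and_get_html` and `run_with_firefox`, the URL, and the CSS selector are all made up for illustration and are not from the question. It also assumes the page has finished updating by the time `click()` returns, which will not hold for every site (an explicit wait may be needed).

```python
def click_and_get_html(driver, url, css_selector):
    """Load url, click the first element matching css_selector,
    and return the HTML of the page as the browser sees it afterwards.

    `driver` is expected to be a Selenium WebDriver instance.
    """
    driver.get(url)
    # "css selector" is the string value of selenium's By.CSS_SELECTOR
    element = driver.find_element("css selector", css_selector)
    # The browser runs the element's onClick handler for us
    element.click()
    # Assumption: the page has already updated; slow pages need a wait here
    return driver.page_source


def run_with_firefox(url, css_selector):
    """Hypothetical convenience wrapper: start Firefox, run the helper, clean up."""
    # Imported here so the helper above can be used without Selenium installed
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        return click_and_get_html(driver, url, css_selector)
    finally:
        driver.quit()
```

Something like `html = run_with_firefox("http://example.com/", "#some-button")` would then give you the post-click HTML as a string, which you can feed into whatever parser you already use for reading the page.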