
Python library for rendering HTML and javascript [closed]

Is there a Python module that can render an HTML page, including its JavaScript, and give back a DOM object?

I want to parse a page that generates almost all of its content using JavaScript.

cnu asked Sep 24 '08 09:09





2 Answers

The big complication here is emulating the full browser environment outside of a browser. You can use standalone JavaScript interpreters like Rhino and SpiderMonkey to run JavaScript code, but they don't provide the complete browser-like environment needed to fully render a web page.

If I had to solve a problem like this, I would first look at how the JavaScript renders the page. It is quite possible that it fetches data via AJAX and uses that to build the page. I could then use Python libraries such as simplejson and httplib2 to fetch the data directly and work with that, removing the need for a DOM object. That is only one possible situation, though; I don't know the exact problem you are solving.
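As a rough sketch of that direct-fetch approach: the snippet below uses the standard library's json and urllib.request in place of simplejson and httplib2. The endpoint URL and the payload shape (an "items" list whose entries have a "title" key) are made up for illustration; the real ones would have to be found by watching the page's network requests.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, timeout=10):
    """Fetch a URL and decode the response body as JSON."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return json.loads(response.read().decode(charset))


def extract_titles(payload):
    """Pull titles out of the payload (hypothetical shape, see above)."""
    return [item["title"] for item in payload.get("items", [])]


# Usage (hypothetical endpoint, replace with the one the page really calls):
# payload = fetch_json("http://example.com/api/items.json")
# print(extract_titles(payload))
```

This only works when the data endpoint can be called without the session state or tokens that the page's own script sets up.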

Other options include the Selenium one mentioned by Łukasz, some kind of embedded WebKit craziness, some kind of IE win32 scripting craziness or, finally, a PyXPCOM-based solution (with added craziness). All of these have the drawback of requiring pretty much a fully running web browser for Python to play with, which might not be an option depending on your environment.
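For the Selenium route, a minimal sketch might look like the following. It assumes the selenium package and a browser driver are installed. The helper names (LinkCollector, rendered_links) are invented for this example, and collecting links stands in for whatever you actually need from the rendered page.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def rendered_links(driver, url):
    """Load url in a browser driver and return the links in the resulting page.

    driver is any object with get() and page_source, such as a Selenium
    WebDriver instance.
    """
    driver.get(url)
    collector = LinkCollector()
    # page_source reflects the DOM at the time of the call.
    collector.feed(driver.page_source)
    return collector.links


# Usage with Selenium (assumes Firefox and its driver are available):
# from selenium import webdriver
# driver = webdriver.Firefox()
# try:
#     print(rendered_links(driver, "http://example.com/"))
# finally:
#     driver.quit()
```

Content that scripts add after the initial load may not be in page_source yet, so a real scraper would probably need to wait for it explicitly.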

Michael Twomey answered Sep 22 '22 01:09



You can probably use python-webkit for this. It requires a running GLib and GTK, but that is probably less problematic than wrapping the parts of WebKit without GLib.

I don't know if it does everything you need, but I guess you should give it a try.

Armin Ronacher answered Sep 20 '22 01:09
