
JavaScript (and HTML rendering) engine without a GUI for automation?

Are there any libraries or frameworks that provide the functionality of a browser, but do not need to actually render physically onto the screen?

I want to automate navigation on web pages (Mechanize does this, for example), but I want the full browser experience, including JavaScript. So I'd like a virtual browser of some sort that lets me "click on links" programmatically, builds the DOM and executes the page's scripts, and lets me manipulate the resulting elements.

Solution preferably in Python, but I can manage others.
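
To illustrate the gap, here is a minimal sketch using only the standard library (it is not Mechanize itself, and the page, the `LinkCollector` class, and `static_links` are made up for illustration). A parser that never executes scripts only sees links present in the raw markup, so a link added by JavaScript is missed:

```python
from html.parser import HTMLParser

# Hypothetical page: one link in the markup, one that exists only after the script runs.
PAGE = """
<html><body>
  <a href="/static">Static link</a>
  <script>
    var a = document.createElement("a");
    a.href = "/dynamic";
    document.body.appendChild(a);
  </script>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags found in the raw markup."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)


def static_links(html):
    """Return the links visible without running any JavaScript."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    # Only the static link shows up; "/dynamic" needs a real JS engine.
    print(static_links(PAGE))
```

What I'm after is something that would also find `/dynamic`, because it actually runs the script.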

Oliver Zheng asked Jan 23 '23 04:01


2 Answers

PhantomJS and PyPhantomJS are what I use for tasks like these.

It's a headless, WebKit-based browser that is fully controllable via JavaScript. There's a C++ implementation (PhantomJS) and a Python one (PyPhantomJS). I prefer the Python one, though, because it has a plugin system that lets you add functionality to the core without modifying any of the core code, unlike the C++ one. :)
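
As a rough sketch of what "controllable via JavaScript" looks like when driven from Python: the snippet below writes a small PhantomJS script to a temporary file and runs it with `subprocess`. It assumes a `phantomjs` binary on your `PATH` and a release that provides the `webpage` and `system` modules; `build_command` and `page_title` are illustrative helper names, not part of PhantomJS or PyPhantomJS.

```python
import os
import shutil
import subprocess
import tempfile

# Script run inside PhantomJS: open the URL passed on the command line,
# let the page's JavaScript run, then print document.title and exit.
PHANTOM_SCRIPT = """
var page = require('webpage').create();
var system = require('system');
page.open(system.args[1], function (status) {
    if (status !== 'success') {
        console.log('FAILED');
        phantom.exit(1);
    } else {
        console.log(page.evaluate(function () { return document.title; }));
        phantom.exit(0);
    }
});
"""


def build_command(script_path, url, binary="phantomjs"):
    """Build the argv list: phantomjs <script> <url>."""
    return [binary, script_path, url]


def page_title(url, binary="phantomjs", timeout=30):
    """Return the page title as seen by PhantomJS, or None if unavailable."""
    if shutil.which(binary) is None:
        return None  # assumption: the binary must be installed separately
    with tempfile.NamedTemporaryFile("w", suffix=".js", delete=False) as f:
        f.write(PHANTOM_SCRIPT)
        script_path = f.name
    try:
        result = subprocess.run(
            build_command(script_path, url, binary),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    finally:
        os.remove(script_path)
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print(page_title("http://example.com/"))
```

The function passed to `page.evaluate` runs inside the page itself, so that is where you would query or click DOM elements once the page's own scripts have run.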

John Doe answered Jan 26 '23 17:01


There is an absolute ton of free software technology now available: take your pick at http://wiki.python.org/moin/WebBrowserProgramming. If you have specific questions, join pyjamas-dev on Google Groups and I'll be happy to give further details there.

Brief answer: you can run pywebkitgtk "headless", or you can use xulrunner (via python-hulahop), again using pygtk without actually calling "browserwidget.show()". There's also pykhtml. You could also use Python COM to connect to MSHTML.DLL.

These are all "cheat" methods: they use Python bindings to a graphical web browser engine without actually firing up the graphical bit. If you really wanted to put some serious hard-core programming in, you could create a "port" of WebKit that is not connected to a GUI toolkit. As an experienced WebKit programmer, I'd put it at around 2 weeks of full-time effort to make such a "headless" version of WebKit.

l.

user362834 answered Jan 26 '23 17:01
