
Is there anything for Python that is like readability.js?

I'm looking for a package / module / function etc. that is approximately the Python equivalent of Arc90's readability.js

http://lab.arc90.com/experiments/readability

http://lab.arc90.com/experiments/readability/js/readability.js

so that I can give it some input.html and get back a cleaned-up version of that HTML page's "main text". I want to use it on the server side (unlike the JS version, which runs only in the browser).

Any ideas?

PS: I have tried Rhino + env.js, and that combination works, but the performance is unacceptable: it takes minutes to clean up most HTML content :( (I still haven't figured out why there is such a big performance difference).

Emre Sevinç asked May 27 '10 12:05



1 Answer

Please try my fork, https://github.com/buriy/python-readability, which is fast and has all the features of the latest JavaScript version.
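For reference, a minimal sketch of how the fork might be used on the server side. It assumes the package is installed from PyPI as readability-lxml and exposes a `Document` class with `title()` and `summary()` methods, as its README describes; the helper name `extract_main_content` is made up for illustration.

```python
from typing import Tuple


def extract_main_content(html: str) -> Tuple[str, str]:
    """Return (title, cleaned main-content HTML) for a raw HTML page.

    Hypothetical helper: assumes `pip install readability-lxml`, which
    provides the `readability` module from buriy/python-readability.
    """
    # Imported lazily so the module still loads when the package is missing.
    from readability import Document

    doc = Document(html)
    # title() gives the page title; summary() gives the cleaned-up
    # HTML of what the library judges to be the main text.
    return doc.title(), doc.summary()


# Usage (assumes input.html exists in the current directory):
#     with open("input.html", encoding="utf-8") as f:
#         title, main_html = extract_main_content(f.read())
```

Everything runs in-process on top of lxml, so no JavaScript engine such as Rhino is involved.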

Yuri Baburov answered Sep 22 '22 01:09
