
Python 3 web scraping options

I'm new to Python so I'm sorry if this is a newbie question.

I'm trying to build a program that involves web scraping, and I've noticed that Python 3 seems to have significantly fewer web-scraping modules than Python 2.x.

Beautiful Soup, mechanize, and Scrapy -- the three modules recommended to me -- all seem to be incompatible with Python 3.

I'm wondering if anyone on this forum knows of a good option for web scraping with Python 3.

Any suggestions would be greatly appreciated.

Thanks, Will

Will Fogel asked Oct 25 '22 03:10



1 Answer

lxml.html works on Python 3, and gets you HTML parsing, at least.
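As a rough illustration of what the parsing side might look like, here is a minimal sketch that pulls link text and URLs out of a page with lxml.html. The SAMPLE_HTML string and the extract_links helper are made up for this example; in a real scraper the HTML would come from whatever you use to fetch the page.

```python
import lxml.html

# Hypothetical page content, invented for this example. A real scraper
# would get this text from an HTTP response instead.
SAMPLE_HTML = """
<html><body>
  <h1>Example listing</h1>
  <ul>
    <li><a href="/items/1">First item</a></li>
    <li><a href="/items/2">Second item</a></li>
  </ul>
</body></html>
"""


def extract_links(html_text):
    """Return (link text, href) pairs for every <a> element that has an href."""
    doc = lxml.html.fromstring(html_text)
    # XPath selects all anchor elements carrying an href attribute.
    anchors = doc.xpath("//a[@href]")
    return [(a.text_content().strip(), a.get("href")) for a in anchors]


links = extract_links(SAMPLE_HTML)
for text, href in links:
    print(text, href)
```

This only covers parsing. Fetching pages and filling in forms (the parts mechanize handles) would need something else.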

BeautifulSoup 4, which is in the works, should support Python 3 (I've done some work on this).

Thomas K answered Oct 31 '22 09:10
