
BeautifulSoup does not work for some web sites

I have this script:

import urllib2
from bs4 import BeautifulSoup
url = "http://www.shoptop.ru/"
page = urllib2.urlopen(url).read()
soup = BeautifulSoup(page)
divs = soup.findAll('a')
print divs

For this web site, it prints an empty list. What could be the problem? I am running Ubuntu 12.04.

torayeff asked Nov 18 '25 23:11


1 Answer

BeautifulSoup has a few parser-related bugs that can make it fail in unexpected ways, such as silently returning no results. I had a similar issue when running under Apache with the lxml parser.

So try one of the other parsers mentioned in the documentation, naming it explicitly:

soup = BeautifulSoup(page, "html.parser")

This should work!
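If you are not sure which parser is the culprit, it may help to parse the same markup with each one and compare the results. Below is a minimal sketch, not a definitive fix: the helper name `links_by_parser` and the sample markup are made up for illustration, and it assumes `bs4` is installed (`lxml` and `html5lib` are optional and are skipped when missing).

```python
from bs4 import BeautifulSoup, FeatureNotFound


def links_by_parser(html, parsers=("html.parser", "lxml", "html5lib")):
    """Parse the same markup with each parser and collect the <a> tags found.

    Hypothetical helper for illustration; not part of BeautifulSoup.
    """
    results = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(html, name)
        except FeatureNotFound:
            # The library backing this parser is not installed; skip it.
            continue
        results[name] = soup.find_all("a")
    return results


# Made-up sample markup; in the question this would be the downloaded page.
sample = '<html><body><a href="/one">one</a><a href="/two">two</a></body></html>'

for name, links in links_by_parser(sample).items():
    print("%s: %d links" % (name, len(links)))
```

If one parser reports zero links while another finds them, passing the working parser's name to `BeautifulSoup` explicitly, as above, is the likely fix.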

Surya answered Nov 21 '25 13:11


