python BeautifulSoup searching a tag

Tags:

beautifulsoup

This is my first post here. I'm trying to find all the tags I need in this specific HTML page, but I can't get them out. This is the code:

from bs4 import BeautifulSoup
from urllib import urlopen

url = "http://www.jutarnji.hr"
html_doc = urlopen(url).read()
soup = BeautifulSoup(html_doc)
soup.prettify()
soup.find_all("a", {"class":"black"})

find_all returns [], but I can see that there are tags with class "black" in the HTML. Am I missing something?

Thanks, Vedran


asked Mar 30 '12 17:03

2 Answers

I had the same problem.

Try

soup.findAll("a",{"class":"black"})

instead of

soup.find_all("a",{"class":"black"})

soup.findAll() works well for me.
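For comparison, here is a minimal sketch of the two spellings side by side, assuming bs4 is installed and using Python's built-in "html.parser". The sample markup and the helper name black_links are made up for illustration and are not taken from the page in the question. As far as I know, bs4 keeps findAll as the older BeautifulSoup 3 name for find_all, so on bs4 both calls should agree. If they give different results on your machine, the installed version or the parser is worth checking.

```python
import warnings

from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page.
SAMPLE_HTML = """
<a href="/one" class="black">first</a>
<a href="/two" class="red">second</a>
<a href="/three" class="black">third</a>
"""


def black_links(html):
    """Return the hrefs of <a class="black"> tags, found with both spellings."""
    # Naming the parser explicitly avoids depending on whichever one is installed.
    soup = BeautifulSoup(html, "html.parser")
    new_style = [a["href"] for a in soup.find_all("a", {"class": "black"})]
    with warnings.catch_warnings():
        # Newer bs4 releases may warn that the old spelling is deprecated.
        warnings.simplefilter("ignore")
        old_style = [a["href"] for a in soup.findAll("a", {"class": "black"})]
    return new_style, old_style


new_style, old_style = black_links(SAMPLE_HTML)
print(new_style)
print(old_style)
```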


answered Sep 22 '22 04:09

The problem here is that the website's class attributes aren't separated from the end of the href attribute value by a space. BeautifulSoup doesn't seem to handle this very well. A reproducible test case is the following:

>>> import BeautifulSoup  # the BeautifulSoup 3 module, not bs4
>>> BeautifulSoup.BeautifulSoup('<a href="http://www.jutarnji.hr/crkva-se-ogradila-od--cjenika--don-mikica--osim-krizme--sve-druge-financijske-obveze-su-neprihvatljive/1018314/" class="black">').prettify()
'<a href="http://www.jutarnji.hr/crkva-se-ogradila-od--cjenika--don-mikica--osim-krizme--sve-druge-financijske-obveze-su-neprihvatljive/1018314/" class="black">\n</a>'
>>> BeautifulSoup.BeautifulSoup('<a href="http://www.jutarnji.hr/crkva-se-ogradila-od--cjenika--don-mikica--osim-krizme--sve-druge-financijske-obveze-su-neprihvatljive/1018314/"class="black">').prettify()
''
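The same comparison can be sketched with bs4, assuming it is installed. The URLs here are shortened placeholders, and count_black_links is a hypothetical helper, not part of any library. How the markup without the space is parsed depends on the parser and its version, so run this with the parser you actually use (for example "lxml" or "html5lib", if installed) rather than relying on any particular result.

```python
from bs4 import BeautifulSoup

# Hypothetical short URLs; only the spacing before `class` differs.
WITH_SPACE = '<a href="http://example.com/story/" class="black">story</a>'
WITHOUT_SPACE = '<a href="http://example.com/story/"class="black">story</a>'


def count_black_links(html, parser="html.parser"):
    """Count <a> tags with class "black" after parsing html with the given parser."""
    soup = BeautifulSoup(html, parser)
    return len(soup.find_all("a", {"class": "black"}))


# Print the counts for both variants so they can be compared directly.
for label, html in (("with space", WITH_SPACE), ("without space", WITHOUT_SPACE)):
    print(label, count_black_links(html))
```

If the two counts differ, the missing space is what hides the tags, and switching to a more lenient parser is one thing to try.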

answered Sep 19 '22 04:09

Puneet

Asked by onoxo. Answered by Froyo and Puneet.