I would like to parse an HTML file with Python, and the module I am using is BeautifulSoup.
It is said that the function `find_all` is the same as `findAll`. I've tried both of them, but I believe they are different:
```python
import urllib, urllib2, cookielib
from BeautifulSoup import *

site = "http://share.dmhy.org/topics/list?keyword=TARI+TARI+team_id%3A407"
rqstr = urllib2.Request(site)
rq = urllib2.urlopen(rqstr)
fchData = rq.read()
soup = BeautifulSoup(fchData)
t = soup.findAll('tr')
```
Can anyone tell me the difference?
`find` returns only the first tag on the page that satisfies the search condition (or `None` if nothing matches). `find_all` scans the entire document and returns every match.

`find_all` returns a `ResultSet`, which offers index-based access to the found occurrences and can be iterated with a for loop. A broad search usually picks up elements you do not want, so attributes like `id`, `class`, or `value` are used to further refine the search.
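For illustration, here is a minimal BeautifulSoup 4 sketch; the HTML snippet and the `note` class are invented for the example:

```python
from bs4 import BeautifulSoup  # BeautifulSoup 4

html = """
<div>
  <p id="intro">one</p>
  <p class="note">two</p>
  <p class="note">three</p>
</div>
"""
soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching tag (or None if nothing matches)
print(soup.find("p"))                 # <p id="intro">one</p>

# find_all() scans the whole document and returns a ResultSet
notes = soup.find_all("p", class_="note")
print(notes[0].text)                  # index-based access -> "two"
for p in notes:                       # iterable with a for loop
    print(p.text)                     # "two", "three"
```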
findAll("p", {"class": "pagination-container and something"}) , BeautifulSoup would match an element having the exact class attribute value. There is no splitting involved in this case - it just sees that there is an element where the complete class value equals the desired string.
In BeautifulSoup version 4, the methods are exactly the same; the mixed-case versions (`findAll`, `findAllNext`, `nextSibling`, etc.) have all been renamed to conform to the Python style guide, but the old names are still available to make porting easier. See Method Names for a full list.

In new code, you should use the lowercase versions, so `find_all`, etc.
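A quick sketch of the aliasing in BeautifulSoup 4 (the HTML string here is just an invented example):

```python
from bs4 import BeautifulSoup  # BeautifulSoup 4

soup = BeautifulSoup(
    "<table><tr><td>a</td></tr><tr><td>b</td></tr></table>",
    "html.parser",
)

# In BS4, findAll is simply an alias for find_all; both calls
# return the same ResultSet.
print(soup.find_all('tr') == soup.findAll('tr'))  # True
print(soup.find_all('tr'))
```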
In your example however, you are using BeautifulSoup version 3 (discontinued since March 2012, don't use it if you can help it), where only `findAll()` is available. Unknown attribute names (such as `.find_all`, which is only available in BeautifulSoup 4) are treated as if you are searching for a tag by that name. There is no `<find_all>` tag in your document, so `None` is returned for that.
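For reference, here is a sketch of the same fetch-and-parse logic ported to Python 3 and BeautifulSoup 4 (assuming the `bs4` package is installed; the URL is the one from the question):

```python
import urllib.request
from bs4 import BeautifulSoup  # BeautifulSoup 4

site = "http://share.dmhy.org/topics/list?keyword=TARI+TARI+team_id%3A407"
with urllib.request.urlopen(site) as rq:
    fchData = rq.read()

soup = BeautifulSoup(fchData, "html.parser")
t = soup.find_all('tr')    # preferred name in BS4
# t = soup.findAll('tr')   # old alias, still works to ease porting
print(len(t))
```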