
Difference between "findAll" and "find_all" in BeautifulSoup

I would like to parse an HTML file with Python, and the module I am using is BeautifulSoup.

It is said that the function find_all is the same as findAll. I've tried both of them, but I believe they are different:

    import urllib, urllib2, cookielib
    from BeautifulSoup import *

    site = "http://share.dmhy.org/topics/list?keyword=TARI+TARI+team_id%3A407"

    rqstr = urllib2.Request(site)
    rq = urllib2.urlopen(rqstr)
    fchData = rq.read()

    soup = BeautifulSoup(fchData)

    t = soup.findAll('tr')

Can anyone tell me the difference?

Oberon asked Sep 09 '12 13:09




1 Answer

In BeautifulSoup version 4, the methods are exactly the same; the mixed-case names (findAll, findAllNext, nextSibling, etc.) have all been renamed to conform to the Python style guide, but the old names are still available to make porting easier. See the "Method Names" section of the BeautifulSoup documentation for a full list.

In new code, you should use the lowercase versions, so find_all, etc.
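As a quick check of that equivalence, here is a minimal sketch assuming BeautifulSoup 4 is installed (the bs4 package) and using a small made-up HTML string in place of the page from the question:

```python
from bs4 import BeautifulSoup  # BeautifulSoup 4, installed as "beautifulsoup4"

# Hypothetical sample markup standing in for the fetched page.
html = "<table><tr><td>a</td></tr><tr><td>b</td></tr></table>"
soup = BeautifulSoup(html, "html.parser")

rows_new = soup.find_all("tr")  # lowercase name, preferred in new code
rows_old = soup.findAll("tr")   # mixed-case name, kept for porting

# Compare the serialized rows returned by each spelling.
same_rows = [str(r) for r in rows_new] == [str(r) for r in rows_old]
print(len(rows_new), same_rows)
```

Both calls should find the same two rows here; only the spelling of the method differs.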

In your example, however, you are using BeautifulSoup version 3 (discontinued since March 2012; don't use it if you can help it), where only findAll() is available. Unknown attribute names (such as .find_all, which is only available in BeautifulSoup 4) are treated as if you were searching for a tag by that name. There is no <find_all> tag in your document, so soup.find_all evaluates to None, and trying to call it fails with a TypeError.
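The attribute fallback described above can be imitated in plain Python. This is only a rough sketch of the idea, not BeautifulSoup 3's actual code: the SketchTag class and its tiny hand-built tree are hypothetical, made up for illustration.

```python
class SketchTag:
    """Hypothetical stand-in for a parsed tag; not BeautifulSoup's real class."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def find(self, name):
        # Depth-first search for the first descendant with this tag name.
        for child in self.children:
            if child.name == name:
                return child
            found = child.find(name)
            if found is not None:
                return found
        return None

    def findAll(self, name):
        # Collect every descendant with this tag name, in document order.
        results = []
        for child in self.children:
            if child.name == name:
                results.append(child)
            results.extend(child.findAll(name))
        return results

    def __getattr__(self, attr):
        # Runs only when normal attribute lookup fails.
        if attr.startswith("__"):
            raise AttributeError(attr)
        # Any other unknown attribute is treated as a tag-name search.
        return self.find(attr)


doc = SketchTag("html", [SketchTag("table", [SketchTag("tr"), SketchTag("tr")])])

rows = doc.findAll("tr")   # a real method: returns both rows
missing = doc.find_all     # no such method: becomes a search for <find_all>

print(len(rows))
print(missing)

try:
    doc.find_all("tr")     # equivalent to None("tr")
except TypeError as exc:
    print("TypeError:", exc)
```

In this sketch, doc.find_all is not an error by itself; the failure only appears when the None result is called like a function.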

Martijn Pieters answered Oct 02 '22 00:10