Beautiful Soup select vs find_all data type

Tags: python, beautifulsoup

I am new to web scraping, and there seem to be two ways to gather all the HTML data I am looking for.

option_1 = soup.find_all('div', class_='p')

option_2 = soup.select('div.p')

I see that option_1 returns class 'bs4.element.ResultSet' and option_2 returns class 'list'

I can still iterate through option_1 with a for loop, so what is the difference between:

select and find_all
'list' and bs4.element.ResultSet


asked Oct 19 '17 18:10

Mwspencer

1 Answer

You should find the answer to your first question here (linked by t-m-adam in the comments).

As for the second question, let's take a look at the source code :)

class ResultSet(list):
    """A ResultSet is just a list that keeps track of the SoupStrainer
    that created it."""
    def __init__(self, source, result=()):
        super(ResultSet, self).__init__(result)
        self.source = source

    def __getattr__(self, key):
        raise AttributeError(
            "ResultSet object has no attribute '%s'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?" % key
        )

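ResultSet is just a subclass of list used to store the results of the find_all() method. Because it inherits from list, you can iterate over it, index it, and take its length exactly as you would with the plain list that select() gave you. The only additions are the source attribute and a friendlier error message when you treat the whole result as a single tag.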
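
To see this behaviour without installing anything, here is a minimal sketch that re-declares a stand-in copy of the class quoted above and fills it with plain strings instead of real Tag objects. The sample data and variable names are made up for illustration, and the error message is shortened; with real bs4 you would get the object back from find_all() rather than build it yourself.

```python
class ResultSet(list):
    """Stand-in copy of the class quoted above, so this runs without bs4."""

    def __init__(self, source, result=()):
        super().__init__(result)
        self.source = source

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        raise AttributeError(
            "ResultSet object has no attribute '%s'. "
            "Did you call find_all() when you meant to call find()?" % key
        )


# Hypothetical data: strings standing in for the Tag objects find_all() returns.
results = ResultSet(source=None, result=["<div class='p'>a</div>",
                                         "<div class='p'>b</div>"])

is_list = isinstance(results, list)  # a ResultSet passes any list check
count = len(results)                 # len() works as on a list
first = results[0]                   # so does indexing
sliced = results[:1]                 # slicing hands back a plain list

try:
    results.text                     # an attribute only a single tag would have
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(is_list, count, type(sliced).__name__)
print(error_message)
```

So for everyday use (for loops, len(), indexing) the two return types behave the same; the difference only shows up when you accidentally access a tag attribute such as .text on the whole result.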

answered Oct 06 '22 00:10

radzak
