
BeautifulSoup: Return None if HTML element not found

I'm using BeautifulSoup to search for several elements in a web page.

I'm saving the elements I find, but an element may not exist on the particular page being parsed, so I have a try/except statement for every element:

# go through a bunch of webpages
for soup in soups:
    try: # look for HTML element
        data['val1'].append(soup.find('div', class_="something").text)
    except: # add NA if nothing found
        data['val1'].append("N/A")
    try:
        data['val2'].append(soup.find('span', class_="something else").text)
    except:
        data['val2'].append("N/A")

    # and more and more try/excepts for more elements of interest

Is there a cleaner or better way to write something like this?

David Skarbrevik asked Jun 02 '18 23:06


3 Answers

According to the documentation, the find method returns None if it can't find anything. So the exception occurs when you access the 'text' attribute of None.

Maybe you should take a look at Python's conditional expression (the "ternary operator") to handle that case:

result = soup.find('div', class_="something")
data['val1'].append(result.text if result else "N/A")

Also, as Dan-Dev pointed out, catching an exception is expensive:

A try/except block is extremely efficient if no exceptions are raised. Actually catching an exception is expensive.
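
Since the same find-then-check pattern repeats for every element, one option is to fold it into a small helper. The sketch below is only an illustration: `text_or_default` is a hypothetical name, not part of BeautifulSoup, and it assumes `soup` is a BeautifulSoup object (or any object whose `find` returns None on a miss).

```python
def text_or_default(soup, name, class_, default="N/A"):
    """Return the text of the first matching element, or `default` if none.

    Hypothetical helper, not part of BeautifulSoup. `soup` is assumed to
    expose find(name, class_=...) and to return None when nothing matches.
    """
    found = soup.find(name, class_=class_)
    # Compare against None explicitly so the intent is unambiguous.
    return found.text if found is not None else default
```

With this in place, each lookup in the loop becomes one line, for example `data['val1'].append(text_or_default(soup, 'div', 'something'))`.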

Francisco Jiménez Cabrera answered Oct 10 '22 07:10


Catching an exception is expensive, so I'd use an if/else statement instead of try/except.

v = soup.find('div', class_="something")
if v:
    data['val1'].append(v.text)
else:
    data['val1'].append("N/A")
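
How much this matters depends on how often elements are missing. A rough way to check on your own machine is sketched below, using only the standard library. The names `with_try` and `with_check` and the `SimpleNamespace` stand-in are illustrative assumptions, and no BeautifulSoup parsing is timed here, so treat the numbers as relative only.

```python
import timeit
from types import SimpleNamespace

DEFAULT = "N/A"

def with_try(found):
    """Read .text and catch the AttributeError raised when found is None."""
    try:
        return found.text
    except AttributeError:
        return DEFAULT

def with_check(found):
    """Test for None first, so no exception is ever raised."""
    return found.text if found is not None else DEFAULT

hit = SimpleNamespace(text="hello")  # stand-in for a found element
miss = None                          # what find() returns on a miss

for label, value in (("hit", hit), ("miss", miss)):
    t_try = timeit.timeit(lambda: with_try(value), number=200_000)
    t_check = timeit.timeit(lambda: with_check(value), number=200_000)
    print(f"{label}: try/except {t_try:.3f}s, if/else {t_check:.3f}s")
```

Parsing the page usually dominates the runtime of a scraper, so the readability of the explicit check is arguably a stronger reason to prefer it than raw speed.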

Dan-Dev answered Oct 10 '22 09:10


This achieves what you want and also reduces code repetition a bit more by wrapping things in a for loop:

info = [("val1", "div", "something"),
       ("val2", "span", "something else")]

# go through a bunch of webpages
for soup in soups:
    for (val, element, class1) in info:
        query = soup.find(element, class_=class1)
        data[val].append(query.text if query else "N/A")
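
The loop above assumes `data` already exists with a list under each key. A minimal self-contained sketch, assuming `soups` is any iterable of parsed pages, could build it with `collections.defaultdict`. The function name `collect` and the `FIELDS` layout are hypothetical and only for illustration.

```python
from collections import defaultdict

# (output key, tag name, class) for each element of interest.
FIELDS = [
    ("val1", "div", "something"),
    ("val2", "span", "something else"),
]

def collect(soups, fields=FIELDS, default="N/A"):
    """Gather one value per field per page, using `default` for misses."""
    data = defaultdict(list)
    for soup in soups:
        for key, element, css_class in fields:
            query = soup.find(element, class_=css_class)
            data[key].append(query.text if query is not None else default)
    return dict(data)
```

Because every page contributes exactly one entry per key, the lists stay aligned with each other, and adding another element of interest only means adding a tuple to `FIELDS`.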

iacob answered Oct 10 '22 09:10