I'm using BeautifulSoup to search for several elements in a web page.
I save the elements I find, but a given element may not exist on the particular page being parsed, so I wrap every lookup in its own try/except statement:
# go through a bunch of webpages
for soup in soups:
    try:  # look for HTML element
        data['val1'].append(soup.find('div', class_="something").text)
    except:  # add NA if nothing found
        data['val1'].append("N/A")
    try:
        data['val2'].append(soup.find('span', class_="something else").text)
    except:
        data['val2'].append("N/A")
    # and more and more try/excepts for more elements of interest
Is there a cleaner or better way to write something like this?
According to the documentation, the find method returns None if it can't find anything. The exception (an AttributeError) is therefore raised when you access the 'text' attribute of None.
Python's conditional expression (the ternary operator) lets you handle that case in one line:
result = soup.find('div', class_="something")
data['val1'].append(result.text if result else "N/A")
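If the same check is needed for many fields, it could be folded into a small helper. This is only a minimal sketch: `text_or_default` is a hypothetical name, not part of BeautifulSoup, and `SimpleNamespace` stands in for a found tag so the snippet runs without any HTML.

```python
from types import SimpleNamespace


def text_or_default(node, default="N/A"):
    """Return node.text, or `default` when node is None (nothing was found)."""
    return node.text if node is not None else default


# Stand-in for a tag returned by soup.find(...); a real Tag also exposes .text
found = SimpleNamespace(text="hello")
missing = None  # what find() returns when there is no match

print(text_or_default(found))
print(text_or_default(missing))
```

With a real page this would be used as `data['val1'].append(text_or_default(soup.find('div', class_="something")))`.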
Also, as Dan-Dev pointed out, catching an exception is expensive:
A try/except block is extremely efficient if no exceptions are raised. Actually catching an exception is expensive.
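One way to check this claim on your own machine is a rough timing comparison with the standard `timeit` module. The sketch below is an assumption-laden illustration: `Node`, `with_try` and `with_check` are hypothetical names, `Node` stands in for a found tag, and the absolute numbers will vary by machine and Python version.

```python
import timeit


class Node:
    """Stand-in for a found tag; only the .text attribute matters here."""
    text = "hello"


def with_try(node):
    # Relies on the AttributeError raised by None.text
    try:
        return node.text
    except AttributeError:
        return "N/A"


def with_check(node):
    # Tests for None up front, so no exception is ever raised
    return node.text if node is not None else "N/A"


if __name__ == "__main__":
    for label, node in (("found", Node()), ("missing", None)):
        for fn in (with_try, with_check):
            seconds = timeit.timeit(lambda: fn(node), number=200_000)
            print(f"{label:8} {fn.__name__:11} {seconds:.3f}s")
```

Both functions return the same values; only the "missing" case forces `with_try` to actually catch an exception, which is where any difference should show up.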
A try/except is expensive when the exception is actually raised. I'd use an if/else statement instead.
v = soup.find('div', class_="something")
if v:
    data['val1'].append(v.text)
else:
    data['val1'].append("N/A")
This achieves what you want and also reduces code repetition a bit more by wrapping things in a for loop:
info = [("val1", "div", "something"),
        ("val2", "span", "something else")]
# go through a bunch of webpages
for soup in soups:
    for (val, element, class1) in info:
        query = soup.find(element, class_=class1)
        data[val].append(query.text if query else "N/A")