I am parsing an HTML table with BeautifulSoup like this:
for tr in table_body.find_all('tr'):
    for td in tr:
        if td.text == 'Description':
            description = td.find_next('td').text
        if td.text == 'Category':
            category = td.find_next('td').text
        if td.text == 'Department':
            department = td.find_next('td').text
        if td.text == 'Justification':
            justification = td.find_next('td').text
print(description, category, department, justification)
I refactored the multiple if statements into a function:
def html_check(td, text):
    if td.text == text:
        value = td.find_next('td').text
        return value
that is called like this:
for tr in table_body.find_all('tr'):
    for td in tr:
        description = html_check(td, 'Description')
        category = html_check(td, 'Category')
        department = html_check(td, 'Department')
        justification = html_check(td, 'Justification')
print(description, category, department, justification)
My problem is that when html_check does not find a match, it returns None, and that None ends up being printed. This is not desirable.
Is there any way to make this function return a value only when the if condition in it is met?
A Python function always returns None if it finishes without reaching a return statement. Your options are:
Option 1 (return something else when the condition isn't met):
def html_check(td, text):
    if td.text == text:
        value = td.find_next('td').text
        return value
    return "no value found"
Option 2 (skip the result when the function returns None):
if html_check(td, 'section'):
    pass  # do things
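Building on option 2, here is a minimal sketch of a loop that keeps a value only when a match actually occurs. `Cell` is a hypothetical stand-in for a BeautifulSoup tag (it only mimics `.text` and `.find_next`) so the example runs without an HTML document, and the `found` dictionary is likewise an assumption, not part of the original code.

```python
class Cell:
    """Hypothetical stand-in for a BeautifulSoup <td> tag."""

    def __init__(self, text, next_cell=None):
        self.text = text
        self._next = next_cell

    def find_next(self, name):
        # The real find_next searches the document; this stand-in simply
        # returns the cell that was registered as the following one.
        return self._next


def html_check(td, text):
    if td.text == text:
        return td.find_next('td').text
    # No explicit return on this path, so Python returns None.


value_cell = Cell('Broken printer')
cells = [Cell('Description', value_cell), value_cell]

found = {}
for td in cells:
    for label in ('Description', 'Category'):
        value = html_check(td, label)
        if value is not None:  # store only real matches
            found[label] = value

print(found)
```

Comparing with `is not None` rather than relying on truthiness means an empty string in a value cell would still count as a match.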
You can return a default value when no element matches. Something like:
def html_check(td, text):
    if td.text == text:
        value = td.find_next('td').text
        return value
    return "Default Value"
You can also pass the default value in as an argument, which would look something like this:
def html_check(td, text, default_value):
    if td.text == text:
        value = td.find_next('td').text
        return value
    return default_value
Then use it like this:
for tr in table_body.find_all('tr'):
    for td in tr:
        description = html_check(td, 'Description', 'Default Description')
        category = html_check(td, 'Category', 'Default Category')
        department = html_check(td, 'Department', 'Default Department')
        justification = html_check(td, 'Justification', 'Default Justification')
print(description, category, department, justification)
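One caveat with assigning inside the loop is that every later non-matching cell overwrites an earlier match with the default. A possible variation, sketched here under the assumption that each label cell is directly followed by its value cell, collects matches in a dictionary and applies the defaults only once at the end. The `extract_fields` helper and the `rows` data are hypothetical; plain (label, value) tuples stand in for the parsed table rows.

```python
def extract_fields(rows, defaults):
    """Return one value per label in defaults, using the default if missing."""
    found = {}
    for label, value in rows:
        if label in defaults:  # ignore labels we were not asked for
            found[label] = value
    # Fill in the default only for labels that never matched.
    return {label: found.get(label, default)
            for label, default in defaults.items()}


# Hypothetical data: (label text, value text) pairs taken from table rows.
rows = [('Description', 'Broken printer'), ('Department', 'IT')]

defaults = {
    'Description': 'Default Description',
    'Category': 'Default Category',
    'Department': 'Default Department',
    'Justification': 'Default Justification',
}

fields = extract_fields(rows, defaults)
print(fields['Description'], fields['Category'],
      fields['Department'], fields['Justification'])
```

With BeautifulSoup, such pairs could presumably be built from the text of the cells in each row; that step is left out here because it depends on the table's actual layout.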