Using lambda functions in Beautiful Soup

Question

I'm trying to match links whose href contains certain text. I'm doing:

links = soup.find_all('a',href=lambda x: ".org" in x)

But that throws a TypeError: argument of type 'NoneType' is not iterable.

The correct way of doing it is apparently

links = soup.find_all('a',href=lambda x: x and ".org" in x)

Why is the additional x and necessary here?

Aran-Fey · Accepted Answer

There's a simple reason: One of the <a> tags in your HTML has no href attribute.

Here's a minimal example that reproduces the exception:

from bs4 import BeautifulSoup

html = '<html><body><a>bar</a></body></html>'
soup = BeautifulSoup(html, 'html.parser')

links = soup.find_all('a', href=lambda x: ".org" in x)
# result:
# TypeError: argument of type 'NoneType' is not iterable

Now if we add an href attribute, the exception disappears:

html = '<html><body><a href="foo.org">bar</a></body></html>'
soup = BeautifulSoup(html, 'html.parser')

links = soup.find_all('a', href=lambda x: ".org" in x)
# result:
# [<a href="foo.org">bar</a>]

What's happening is that BeautifulSoup looks up each <a> tag's href attribute and passes the value to your lambda, and that value is None when the attribute doesn't exist:

html = '<html><body><a>bar</a></body></html>'
soup = BeautifulSoup(html, 'html.parser')

print(soup.a.get('href'))
# output: None
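
To see what the filter actually receives, here is a minimal sketch that records every value passed to it. The names record_href and seen are made up for this illustration, and the guard inside is the explicit None check rather than the lambda from the question:

```python
from bs4 import BeautifulSoup

html = '<html><body><a>bar</a><a href="foo.org">baz</a></body></html>'
soup = BeautifulSoup(html, 'html.parser')

seen = []  # every value BeautifulSoup hands to the filter (hypothetical helper list)

def record_href(value):
    # Remember the argument, then apply a None-safe ".org" check.
    seen.append(value)
    return value is not None and ".org" in value

links = soup.find_all('a', href=record_href)

print(seen)   # includes None for the <a> tag without an href
print(links)  # only the tag whose href contains ".org"
```

The tag without an href shows up in seen as None, which is the value the unguarded lambda chokes on.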

This is why it's necessary to allow None values in your lambda. Since None is a falsy value, the code x and ... prevents the right side of the and expression from being evaluated when x is None, as you can see here:

>>> None and 1/0
>>> 'foo.org' and 1/0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: division by zero

This is called short-circuiting.
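
As a small illustration in plain Python (no BeautifulSoup needed), the function below mirrors the lambda from the question. The name has_org is made up for this sketch. Note that and returns its left operand when that operand is falsy, and otherwise returns the result of the right side:

```python
def has_org(x):
    # Same expression as the lambda in the question.
    return x and ".org" in x

print(repr(has_org(None)))       # None  (right side never evaluated)
print(repr(has_org("")))         # ''    (empty string is falsy too)
print(repr(has_org("foo.org")))  # True
print(repr(has_org("foo.com")))  # False
```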

That said, x and ... checks the truthiness of x, and None is not the only value that's considered falsy (an empty string is falsy as well). So it would be more correct to compare x to None explicitly, like so:

lambda x: x is not None and ".org" in x
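
Here is a sketch that runs both guards side by side on a document containing a missing href, an empty href, and a matching href. The function name href_contains_org and the sample HTML are assumptions made for this example. For this particular ".org" check the two guards should select the same tags, since an empty string cannot contain ".org" anyway, so the explicit comparison is mostly about stating the intent clearly:

```python
from bs4 import BeautifulSoup

html = (
    '<html><body>'
    '<a>no href</a>'
    '<a href="">empty href</a>'
    '<a href="foo.org">org link</a>'
    '</body></html>'
)
soup = BeautifulSoup(html, 'html.parser')

def href_contains_org(href):
    # Explicit None check: only a missing attribute is skipped up front.
    return href is not None and ".org" in href

truthy_guard = soup.find_all('a', href=lambda x: x and ".org" in x)
explicit_guard = soup.find_all('a', href=href_contains_org)

print([a['href'] for a in truthy_guard])
print([a['href'] for a in explicit_guard])
```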
