I have this link:
<a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">Santa Clara, California</a>
How can I use BeautifulSoup to find specifically this link, whose href includes the word "location"?
To find elements that contain specific text in Beautiful Soup, we can use the find_all() method together with a lambda function.
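A minimal sketch of that text-based approach, assuming a small hand-written HTML snippet (the second link, "/about", is hypothetical and only there to show that non-matching links are filtered out):

```python
from bs4 import BeautifulSoup

# Sample markup: the link from the question plus a hypothetical second link.
html = (
    '<p>Visit <a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">'
    'Santa Clara, California</a> or <a href="/about">About us</a>.</p>'
)
soup = BeautifulSoup(html, "html.parser")

# find_all() accepts a function that receives each tag and returns True to keep it.
# Here we keep only <a> tags whose visible text contains the search string.
matches = soup.find_all(lambda tag: tag.name == "a" and "Santa Clara" in tag.get_text())
texts = [tag.get_text() for tag in matches]
```

Note that this filters on the link text, not on the href, so it answers a slightly different question from the one asked above.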
The find() method returns the first tag that matches the specified name or attributes (such as id), as a bs4 object. For example, on a simple HTML page with several paragraph tags, find("p") returns only the first one.
Steps to be followed: fetch the HTML document with a get() method by passing the URL to it; create a parse tree object (the soup object) with BeautifulSoup(), passing it the HTML document extracted above and Python's built-in HTML parser; then use the a tag to extract the links from the BeautifulSoup object.
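The parsing and extraction steps can be sketched as follows. To keep the example self-contained, a literal string stands in for the downloaded page; the paragraphs and the "/about" link are made up for illustration.

```python
from bs4 import BeautifulSoup

# Stand-in for the HTML document that a get() call would return (hypothetical page).
html_doc = """
<html><body>
<p>First paragraph</p>
<p>Second paragraph</p>
<a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">Santa Clara, California</a>
<a href="/about">About</a>
</body></html>
"""

# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(html_doc, "html.parser")

# find() returns only the first matching tag.
first_paragraph = soup.find("p")

# find_all("a", href=True) returns every <a> tag that has an href attribute.
links = [a["href"] for a in soup.find_all("a", href=True)]
```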
You can do it with a simple "contains" CSS selector:
soup.select("a[href*=location]")
Or, if only one link needs to be matched, use select_one():
soup.select_one("a[href*=location]")
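Here is a small runnable sketch of the selector approach. The "/about" and "/jobs/relocation" links are hypothetical; the latter is included to show that the "contains" selector matches the substring anywhere in the href, so a stricter prefix selector may be preferable if that is a concern.

```python
from bs4 import BeautifulSoup

# The link from the question plus two hypothetical links for comparison.
html = """
<a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">Santa Clara, California</a>
<a href="/about">About</a>
<a href="/jobs/relocation">Relocation</a>
"""
soup = BeautifulSoup(html, "html.parser")

# *= matches "location" anywhere in the href, including inside "relocation".
contains = [a["href"] for a in soup.select("a[href*=location]")]

# ^= matches only hrefs that start with the given prefix.
prefixed = [a["href"] for a in soup.select('a[href^="/location/"]')]

# select_one() returns the first match, or None if nothing matches.
first = soup.select_one("a[href*=location]")
```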
And, of course, there are many other ways - for instance, you can use find_all(), providing the href argument, which can have a regular expression value or a function:
import re

soup.find_all("a", href=re.compile("location"))
soup.find_all("a", href=lambda href: href and "location" in href)
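For completeness, here is a self-contained sketch of both find_all() variants. The anchor without an href and the "/about" link are hypothetical; they illustrate why the function version checks that href is truthy before testing for the substring, since the function receives None when the attribute is missing.

```python
import re
from bs4 import BeautifulSoup

# The link from the question, a hypothetical anchor with no href, and a non-matching link.
html = """
<a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">Santa Clara, California</a>
<a name="top">Top</a>
<a href="/about">About</a>
"""
soup = BeautifulSoup(html, "html.parser")

# A compiled regular expression is searched for within each href value.
by_regex = soup.find_all("a", href=re.compile("location"))

# A function is called with each href value (None if the attribute is absent).
by_function = soup.find_all("a", href=lambda href: href and "location" in href)

hrefs = [a["href"] for a in by_regex]
```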