
BeautifulSoup to find a link that contains a specific word

I have this link:

<a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">Santa Clara, California</a> 

How can I use BeautifulSoup to find specifically this link, whose href includes the word "location"?

Morgan Allen asked Jul 07 '16 18:07


People also ask

How do I find a specific text in BeautifulSoup?

To find elements that contain a specific text in Beautiful Soup, we can use the find_all() method together with a lambda function.

What is the find() method in BeautifulSoup?

The find() method returns the first tag that matches the specified name or attributes (such as id), typically as a bs4.element.Tag object, or None if nothing matches.

How do I extract links from a website in BeautifulSoup?

Steps to be followed: fetch the page with a get() call by passing the URL to it. Create a parse tree (the soup object) with BeautifulSoup(), passing it the HTML document fetched above and Python's built-in HTML parser. Use the a tag to extract the links from the BeautifulSoup object.
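The steps above can be sketched roughly as follows. This is a minimal illustration, not a definitive recipe: the fetch step is replaced by an inline HTML string so the snippet runs offline, and the second link (/about) is made up for the example.

```python
from bs4 import BeautifulSoup

# Assumption: in a real script this string would come from an HTTP GET
# response body; it is inlined here so the example needs no network.
html = """
<a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">Santa Clara, California</a>
<a href="/about">About</a>
"""

# Build the parse tree with Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching tag (or None).
first_link = soup.find("a")

# find_all() returns every match; href=True skips <a> tags without an href.
links = [a["href"] for a in soup.find_all("a", href=True)]
```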


1 Answer

You can do it with a simple "contains" CSS selector:

soup.select("a[href*=location]") 

Or, if only one link needs to be matched, use select_one():

soup.select_one("a[href*=location]") 

There are many other ways, of course. For instance, you can use find_all() with the href argument, which accepts a regular expression or a function:

import re

soup.find_all("a", href=re.compile("location"))
soup.find_all("a", href=lambda href: href and "location" in href)
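To see the approaches side by side, here is a small self-contained sketch run against the link from the question. The second link (/about) is a made-up non-matching example, and the variable names are only illustrative. The attribute value in the selector is quoted, which is the safer form for values that are not plain identifiers.

```python
import re
from bs4 import BeautifulSoup

# The first link is the one from the question; the second is a
# hypothetical link added only to show that it is not matched.
html = """
<a href="/location/santa-clara/3fce50c4f3f9793d2f503fc145585090">Santa Clara, California</a>
<a href="/about">About</a>
"""
soup = BeautifulSoup(html, "html.parser")

# CSS "contains" selector: every <a> whose href contains "location".
by_css = soup.select('a[href*="location"]')

# Same selector, but only the first match (or None if there is none).
first = soup.select_one('a[href*="location"]')

# Regular expression matched against the href value.
by_regex = soup.find_all("a", href=re.compile("location"))

# Function filter; the "href and" guard handles tags without an href.
by_func = soup.find_all("a", href=lambda href: href and "location" in href)

print(first["href"], first.get_text())
```

All three list results contain the same single tag here; select_one() is the convenient choice when at most one match is expected.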
alecxe answered Sep 29 '22 12:09