
How to find all elements with a custom html attribute regardless of html tag using Beautiful Soup?


I have two cases where I want to scrape HTML tags that carry a custom HTML attribute. Here is an example of the HTML for the first case. How do I scrape all the elements that have the custom attribute "limit"?

<div class="names" limit="10">Bar</div>
<div id="30" limit="20">Foo</div>
<li limit="x">Baz</li>

The second case is similar, but all the elements use the same HTML tag:

<div class="names" limit="10">Bar</div>
<div class="names" limit="20">Bar</div>
<div class="names" limit="30">Bar</div>

My question is different from "How to find tags with only certain attributes - BeautifulSoup" because the latter targets attribute values on a specific tag, whereas my question is about finding elements by attribute alone, regardless of tag or value.

Dap asked Jul 14 '15 20:07




1 Answer

# First case:
soup.find_all(attrs={"limit": True})

# Second case:
soup.find_all("div", attrs={"limit": True})
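To see both calls run end to end, here is a minimal sketch built on the HTML from the question. The extra `<p>` element and the variable names `any_tag` and `divs_only` are my own additions for illustration, and `html.parser` is just one parser choice that avoids extra dependencies.

```python
from bs4 import BeautifulSoup

# HTML from the question, plus one element without "limit" (added for contrast).
html = """
<div class="names" limit="10">Bar</div>
<div id="30" limit="20">Foo</div>
<li limit="x">Baz</li>
<p>No limit here</p>
"""

soup = BeautifulSoup(html, "html.parser")

# First case: any tag that carries a "limit" attribute, whatever its value.
any_tag = soup.find_all(attrs={"limit": True})
print([(tag.name, tag["limit"]) for tag in any_tag])
# [('div', '10'), ('div', '20'), ('li', 'x')]

# Second case: the same attribute filter, restricted to <div> tags.
divs_only = soup.find_all("div", attrs={"limit": True})
print([(tag.name, tag["limit"]) for tag in divs_only])
# [('div', '10'), ('div', '20')]
```

Passing `True` as the attribute value matches any element that has the attribute at all, which is what makes the filter independent of both the tag and the value.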

Reference:

  • http://www.crummy.com/software/BeautifulSoup/bs4/doc/#kwargs
  • http://www.crummy.com/software/BeautifulSoup/bs4/doc/#find-all

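If your attribute name doesn't collide with a Python keyword or with one of soup.find_all's named arguments, the syntax is simpler. Note that limit is itself a named argument of find_all (it caps the number of results), which is why the attrs dictionary is needed for it; an attribute such as id can be passed directly: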

soup.find_all(id=True) 
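A short sketch of the difference, again assuming the HTML from the question and `html.parser`; the variable names are hypothetical. It also shows a CSS attribute selector via `select`, which is an alternative technique and not part of the `find_all` approach above.

```python
from bs4 import BeautifulSoup

html = """
<div class="names" limit="10">Bar</div>
<div id="30" limit="20">Foo</div>
<li limit="x">Baz</li>
"""
soup = BeautifulSoup(html, "html.parser")

# Keyword shortcut: fine for "id", since find_all has no parameter of that name.
with_id = soup.find_all(id=True)
print([tag["id"] for tag in with_id])

# "limit" as a keyword is find_all's result cap, not an attribute filter,
# so the attribute goes in attrs and limit=2 stops after two matches.
capped = soup.find_all(attrs={"limit": True}, limit=2)
print([tag["limit"] for tag in capped])

# Alternative: a CSS attribute selector matches any tag with a "limit" attribute.
via_css = soup.select("[limit]")
print([tag.name for tag in via_css])
```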
Robᵩ answered Oct 26 '22 06:10