How to find a specific data attribute from an HTML tag in BeautifulSoup4?

Question

Is there a way to find an element using only the data attribute in html, and then grab that value?

For example, with this line inside an html doc:

<ul data-bin="Sdafdo39">

How do I retrieve Sdafdo39 by searching the entire html doc for the element that has the data-bin attribute?

xecgr · Accepted Answer

A slightly more precise approach:

[item['data-bin'] for item in bs.find_all('ul', attrs={'data-bin' : True})]

This way, the list being iterated contains only the ul elements that have the attribute you want to find.

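from bs4 import BeautifulSoup
# Define the markup before parsing it
html_doc = """<ul class="foo">foo</ul><ul data-bin="Sdafdo39">"""
bs = BeautifulSoup(html_doc, "html.parser")
# attrs={'data-bin': True} matches only tags that carry the attribute
[item['data-bin'] for item in bs.find_all('ul', attrs={'data-bin' : True})]
# ['Sdafdo39']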
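
The question asks about searching the whole document rather than only ul elements. As a minimal sketch, the same attrs filter can be passed to find_all without a tag name, so any element carrying the attribute is matched. The helper name data_attr_values and the extra div in the sample markup are assumptions made for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup


def data_attr_values(html, attr_name):
    """Return the values of attr_name for every tag that has it (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # No tag name is given, so find_all checks every element in the document.
    return [tag[attr_name] for tag in soup.find_all(attrs={attr_name: True})]


# Sample markup; the <div> is added here only to show a non-ul match.
html_doc = (
    '<ul class="foo">foo</ul>'
    '<ul data-bin="Sdafdo39"></ul>'
    '<div data-bin="x1"></div>'
)

values = data_attr_values(html_doc, "data-bin")
print(values)
```

Passing the attribute name as a parameter keeps the helper reusable for other data-* attributes.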

thefourtheye · Answer

You can use the find_all method to get all the tags, then keep only those that have "data-bin" among their attributes. From each matching tag you can then extract the corresponding value, like this:

from bs4 import BeautifulSoup
html_doc = """<ul data-bin="Sdafdo39">"""
bs = BeautifulSoup(html_doc, "html.parser")
print([item["data-bin"] for item in bs.find_all() if "data-bin" in item.attrs])
# ['Sdafdo39']
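
An alternative to filtering every tag by hand is a CSS attribute selector, which BeautifulSoup4 exposes through select and select_one. This is a sketch under the assumption that a reasonably recent bs4 (with its bundled selector support) is installed; the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html_doc = '<ul class="foo">foo</ul><ul data-bin="Sdafdo39"></ul>'
soup = BeautifulSoup(html_doc, "html.parser")

# "[data-bin]" matches any element that has a data-bin attribute.
first = soup.select_one("[data-bin]")
# select_one returns None when nothing matches, so guard before indexing.
value = first["data-bin"] if first is not None else None

# select returns every match, in document order.
all_values = [tag["data-bin"] for tag in soup.select("[data-bin]")]

print(value)
print(all_values)
```

select_one is convenient when only a single element is expected, while select mirrors the list-based approach above.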

emehex · Answer

You could solve this with gazpacho in just a couple of lines:

First, import and turn the html into a Soup object:

from gazpacho import Soup

html = """<ul data-bin="Sdafdo39">"""
soup = Soup(html)

Then you can just search for the "ul" tag and extract the data-bin attribute:

soup.find("ul").attrs["data-bin"]
# Sdafdo39

Tags:

python

html

beautifulsoup

web-scraping

user21398
