Assuming that the following code:
for data in soup.findAll('div',{'class':'value'}):
    print(data)
gives the following output:
<div class="value">
<p class="name">Michael Jordan</p>
</div>
<div class="value">
<p class="team">Real Madrid</p>
</div>
<div class="value">
<p class="Sport">Ping Pong</p>
</div>
I want to create the following dictionary:
Person = {'name': 'Michael Jordan', 'team': 'Real Madrid', 'Sport': 'Ping Pong'}
I can get the text using data.text, but how can I get the class name of each tag in order to name the keys of the dictionary (Person[key1], Person[key2], ...)?
To extract paragraph text in general: create an HTML document that contains '<p>' tags, pass it to the BeautifulSoup() constructor, select the 'p' tags from the resulting BeautifulSoup object, and read their text with get_text().
find() method: find() returns the first tag that matches the given name or attributes, as a bs4.element.Tag object, or None if nothing matches.
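As a minimal sketch of the steps just described (the HTML string and the variable names here are made up for illustration, and the built-in 'html.parser' backend is assumed):

```python
from bs4 import BeautifulSoup

# Hypothetical sample document with two paragraph tags
html = """
<html><body>
<p class="intro">First paragraph</p>
<p class="body">Second paragraph</p>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# find() returns only the first matching tag
first_p = soup.find("p")
first_text = first_p.get_text()

# find_all() returns every matching tag
all_texts = [p.get_text() for p in soup.find_all("p")]

print(first_text)
print(all_texts)
```

find() is enough when only one element is expected; find_all() is the one to loop over when there are several.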
You could use the following:
content = '''
<div class="value">
<p class="name">Michael Jordan</p>
</div>
<div class="value">
<p class="team">Real Madrid</p>
</div>
<div class="value">
<p class="Sport">Ping Pong</p>
</div>
'''
from bs4 import BeautifulSoup
soup = BeautifulSoup(content, 'html.parser')
person = {}
for div in soup.findAll('div', {'class': 'value'}):
    # The first class name of the <p> tag becomes the dictionary key
    person[div.find('p').attrs['class'][0]] = div.text.strip()
print(person)
Output (Python 3.7+)
{'name': 'Michael Jordan', 'team': 'Real Madrid', 'Sport': 'Ping Pong'}
You can do it like this:
person = {}
# Loop over the matching divs directly; they contain no nested divs
for item in soup.findAll('div',{'class':'value'}):
    attr = item.p.attrs.get("class")[0]
    value = item.p.text
    person[attr] = value
print(person)
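Using this snippet:
from bs4 import BeautifulSoup
soup = BeautifulSoup('''<div class="value">
<p class="Sport other-name-class other">Ping Pong</p>
</div>''', 'html.parser')
# find() does not accept CSS selectors; select_one() does
p = soup.select_one('div.value p')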
I found two ways that give the same result. You can use
p.get_attribute_list('class')
or
p.attrs['class']
Both return a list with all the class names, like this: ['Sport', 'other-name-class', 'other']
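Putting this together, here is a rough sketch of building the dictionary when a tag may carry several classes or none at all. The helper name classes_to_dict and the sample markup are hypothetical, and using only the first class as the key is an assumption carried over from the answers above:

```python
from bs4 import BeautifulSoup

def classes_to_dict(html):
    """Map the first class of each <p> inside div.value to its text."""
    soup = BeautifulSoup(html, "html.parser")
    result = {}
    for p in soup.select("div.value p"):
        # 'class' is multi-valued, so this is a list; default to [] if absent
        classes = p.get("class", [])
        if not classes:
            continue  # skip tags without a class instead of raising an error
        result[classes[0]] = p.get_text(strip=True)
    return result

# Hypothetical sample: one multi-class tag, one tag with no class
sample = """
<div class="value"><p class="name">Michael Jordan</p></div>
<div class="value"><p class="Sport other-name-class other">Ping Pong</p></div>
<div class="value"><p>No class here</p></div>
"""

person = classes_to_dict(sample)
print(person)
```

Skipping class-less tags is just one choice; raising an error or using a placeholder key would work as well, depending on how strict the scraper should be.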