Use BeautifulSoup to get a value after a specific tag

I'm having a very hard time getting BeautifulSoup to scrape some data for me. What's the best way to access the date (the actual number, 2008) from this code sample? It's my first time using BeautifulSoup. I've figured out how to scrape URLs off the page, but I can't quite narrow it down to select only the dt element containing the word Date and then return whatever numeric date follows (inside the dd tags). Is what I'm asking even possible?

<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
        2008
    </dd>
</div>

899

asked Sep 11 '14 03:09

knames

1 Answer

Find the dt tag by text and find the next dd sibling:

soup.find('div', class_='detail_date').find('dt', text='Date').find_next_sibling('dd').text

The complete code:

from bs4 import BeautifulSoup

data = """
<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
    2008
    </dd>
</div>
"""

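soup = BeautifulSoup(data, 'html.parser')
# Locate the <dt> whose text is exactly 'Date' inside the detail_date container.
# (In bs4 4.4.0 and later, string='Date' is the preferred spelling of text='Date'.)
date_field = soup.find('div', class_='detail_date').find('dt', text='Date')
# The value sits in the next <dd> sibling; strip() drops the surrounding whitespace.
print(date_field.find_next_sibling('dd').text.strip())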

Prints 2008.
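
If the page might not always contain the container or the label, the chained calls above will raise AttributeError on None. Here is a rough sketch of a more defensive variant of the same idea. The helper name value_after_label and its parameters are made up for illustration and are not part of BeautifulSoup; it matches the dt on its stripped text, so extra whitespace inside the tag should not matter.

```python
from bs4 import BeautifulSoup


def value_after_label(html, label, container_class="detail_date"):
    """Return the stripped text of the <dd> following <dt>label</dt>, or None.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", class_=container_class)
    if container is None:
        return None
    # Match the <dt> on its stripped text so surrounding whitespace is ignored.
    dt = container.find(
        lambda tag: tag.name == "dt" and tag.get_text(strip=True) == label
    )
    if dt is None:
        return None
    # The value lives in the next <dd> sibling of the matching <dt>.
    dd = dt.find_next_sibling("dd")
    return dd.get_text(strip=True) if dd is not None else None


html = """
<div class='dl_item_container clearfix detail_date'>
    <dt>Date</dt>
    <dd>
        2008
    </dd>
</div>
"""

print(value_after_label(html, "Date"))    # 2008
print(value_after_label(html, "Author"))  # None
```

Returning None instead of raising lets the caller decide what a missing field means, which tends to be handy when looping over many pages with slightly different markup.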

178

answered Nov 03 '22 21:11

alecxe

Tags: python, html-parsing, beautifulsoup, web-scraping
