Using BeautifulSoup to select div blocks within HTML

Question

I am trying to parse several div blocks from a website's HTML with Beautiful Soup. However, I cannot work out which function I should use to select these div blocks. I have tried the following:

import urllib2
from bs4 import BeautifulSoup

def getData():

    html = urllib2.urlopen("http://www.racingpost.com/horses2/results/home.sd?r_date=2013-09-22", timeout=10).read().decode('UTF-8')

    soup = BeautifulSoup(html)

    print(soup.title)
    print(soup.find_all('<div class="crBlock ">'))

getData()

I want to select everything between <div class="crBlock "> and its matching closing </div>. (There are other div tags nested inside, but I want the whole block, all the way down to the </div> that ends this section of the HTML.)

Wiwiweb · Accepted Answer

The correct usage would be:

soup.find_all('div', class_="crBlock ")

find_all() returns a list of the matching tags, and each tag includes its full contents, nested divs and all, up to its matching closing tag. If you store the result in a variable, you can search or modify it further. If you are only looking for one div, you can use find() instead, which returns the first match. For instance:

div = soup.find('div', class_="crBlock ")
print(div.find_all(text='foobar'))

See the Beautiful Soup documentation for more on all the filters you can use.
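
To make this concrete, here is a minimal sketch run against a small inline HTML string instead of the live page. The markup, the "inner" class and the variable names are made up for illustration; only the "crBlock " class comes from the question. Depending on your Beautiful Soup version, a search string with the trailing space may not match, so this sketch searches for the bare class name "crBlock", and also shows the equivalent CSS selector via select().

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page; only the "crBlock " class
# is taken from the question, the rest is invented for this example.
SAMPLE_HTML = """
<html><body>
<div class="crBlock ">
  <div class="inner">Race 1</div>
  <p>foobar</p>
</div>
<div class="other">skip me</div>
<div class="crBlock ">
  <div class="inner">Race 2</div>
</div>
</body></html>
"""

# Name the parser explicitly so the result does not depend on
# which optional parsers happen to be installed.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Search for the class name without the trailing space.
blocks = soup.find_all("div", class_="crBlock")

# The same selection written as a CSS selector.
css_blocks = soup.select("div.crBlock")

# Each result is the whole tag, so nested divs can be searched from it.
inner_texts = [b.find("div", class_="inner").get_text() for b in blocks]

print(len(blocks))
print(inner_texts)
```

Each element of blocks is the complete div, so you never have to locate the closing </div> yourself; the parser has already paired the tags.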

Tags: python, html, beautifulsoup, urllib2, python-2.7

SMNALLY
