Beautiful Soup: Parsing only one element

I keep running into walls, but feel like I'm close here.

HTML block being harvested:

<div class="details">
   <div class="price">
   <h3>From</h3>
   <strike data-round="true" data-currency="USD" data-price="148.00" title="US$148 ">€136</strike>
   <span data-round="true" data-currency="USD" data-price="136.00" title="US$136 ">€125</span>
</div>

I would like to parse out only the "US$136" value (from the span). Here is my logic so far, which captures both the span and the strike data:

price = item.find_all("div", {"class": "price"})
price_final = price[0].text.strip()[8:]
print(price_final)

Any feedback is appreciated:)

Serious Ruffy asked Oct 05 '15 15:10

1 Answer

In your case, price is a ResultSet: a list of the div tags that have the price class. You now need to locate the span tag inside each result (assuming there are multiple prices you want to match):

prices = item.find_all("div", {"class": "price"})
for price in prices:
    price_final = price.span.text.strip()
    print(price_final)
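
To make the loop above easier to try out, here is a minimal, self-contained sketch. It assumes the HTML from the question (with the outer div closed so the snippet stands alone), uses a soup object in place of the question's item, and picks the built-in "html.parser" backend; all three are assumptions for illustration. It also reads the span's title attribute, since that is where the "US$136" string lives in the posted markup, while .text gives the displayed value.

```python
from bs4 import BeautifulSoup

# HTML from the question; the closing </div> for "details" is added
# here only so the snippet is well-formed on its own (assumption).
html = """
<div class="details">
   <div class="price">
      <h3>From</h3>
      <strike data-round="true" data-currency="USD" data-price="148.00" title="US$148 ">€136</strike>
      <span data-round="true" data-currency="USD" data-price="136.00" title="US$136 ">€125</span>
   </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

for price in soup.find_all("div", {"class": "price"}):
    span = price.span                   # first <span> inside this div
    displayed = span.text.strip()       # visible text of the span
    usd_title = span["title"].strip()   # title attribute, trailing space removed
    print(displayed, usd_title)
```

With this markup, displayed holds the text shown on the page (€125) and usd_title holds the attribute value (US$136), so which one to keep depends on which string you are after.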

If there is only one price you need to find:

soup.find("div", {"class": "price"}).span.get_text()

or with a CSS selector:

soup.select_one("div.details div.price span").get_text()
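
One thing to watch with the one-liners above: find() and select_one() return None when nothing matches, so chaining .span or .get_text() directly raises an AttributeError on pages without a price. A small guarded sketch, where first_price_title is a hypothetical helper name and "html.parser" is an assumed backend, might look like this:

```python
from bs4 import BeautifulSoup


def first_price_title(html):
    """Return the stripped title of the first price span, or None if absent.

    Hypothetical helper for illustration; not part of Beautiful Soup.
    """
    soup = BeautifulSoup(html, "html.parser")
    span = soup.select_one("div.details div.price span")
    if span is None:
        # No matching element on this page
        return None
    title = span.get("title")  # .get() returns None if the attribute is missing
    return title.strip() if title else None
```

Calling it on the question's block should give the title string of the span, and calling it on a page without a price div gives None rather than an exception.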

Note that select_one() is only available in newer releases of beautifulsoup4, so upgrade the package if the method is missing:

pip install --upgrade beautifulsoup4

alecxe answered Sep 28 '22 19:09
