Extract class name in scrapy

Question

I am trying to scrape ratings from trustpilot.com.

Is it possible to extract a class name using Scrapy? The rating is rendered as five individual images, but the value itself only appears in a class name on the element that contains them. For example, if the rating is 2 stars:

<div class="star-rating count-2 size-medium clearfix">...

If it is 3 stars, then:

<div class="star-rating count-3 size-medium clearfix">...

So is there a way I can scrape the class count-2 or count-3 assuming a selector like .css('.star-rating')?

Jan · Accepted Answer

You could combine a CSS selector with an XPath expression to read the class attribute, then pick out the count-N part with a regular expression:

import re

classes = response.css('.star-rating').xpath("@class").extract()
for cls in classes:
    match = re.search(r'\bcount-\d+\b', cls)
    if match:
        print("Class = {}".format(match.group(0)))
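
To check the regular expression on its own, outside a spider, here is a minimal sketch that uses only the standard library. The helper name rating_from_class is made up for illustration, and the sample strings simply mirror the class attributes shown in the question. It adds a capture group so the number can be converted to an integer.

```python
import re

# Matches the "count-N" token inside a class attribute value and captures N.
COUNT_RE = re.compile(r'\bcount-(\d+)\b')


def rating_from_class(class_attr):
    """Return the star count as an int, or None if there is no count-N class.

    rating_from_class is a hypothetical helper, not part of Scrapy.
    """
    match = COUNT_RE.search(class_attr)
    return int(match.group(1)) if match else None


print(rating_from_class("star-rating count-2 size-medium clearfix"))
print(rating_from_class("star-rating size-medium clearfix"))
```

Returning None for a missing class keeps the caller in charge of deciding what an unrated review should look like in the scraped item.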

gangabass · Answer

You can extract the rating directly with re(), or with re_first() if you only need the first match:

for rating in response.xpath('//div[contains(@class, "star-rating")]/@class').re(r'count-(\d+)'):
    print(rating)

Tags: python, css-selectors, web-scraping, scrapy

Dan
