Selenium Python: unable to scroll down while fetching Google reviews

I am trying to fetch Google reviews with Selenium in Python. I have imported webdriver from the selenium module and initialized self.driver as follows:

self.driver = webdriver.Chrome(executable_path="./chromedriver.exe",chrome_options=webdriver.ChromeOptions())

Next, I use the following code to type the name of the company whose reviews I need into the search box on the Google homepage. For now I am trying to fetch reviews for "STANLEY BRIDGE CYCLES AND SPORTS LIMITED":

company_name = self.driver.find_element_by_name("q")
company_name.send_keys("STANLEY BRIDGE CYCLES AND SPORTS LIMITED ")
time.sleep(2)

Then I click the Google Search button with the following code:

self.driver.find_element_by_name("btnK").click()
time.sleep(2)

That takes me to the results page. Now I want to click the "View all Google reviews" link, for which I use the following code:

self.driver.find_elements_by_link_text("View all Google reviews")[0].click()
time.sleep(2)

Now I am able to get reviews, but only 10. I need at least 20 reviews per company, so I try to scroll the page down using the following code:

self.driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(5)

Even with the above code to scroll down the page, I still get only 10 reviews. I do not get any error, though.

I need help scrolling down the page to get at least 20 reviews. Based on my online search, most people use driver.execute_script("window.scrollTo(0, document.body.scrollHeight);") to scroll the page down when required, but this is not working for me. I checked the height of the page before and after that call, and it is the same.

Asked by Nidhi Arora, Oct 16 '22 09:10

1 Answer

Use JavaScript to scroll to the last review; this triggers loading of additional reviews.

last_review = self.driver.find_element_by_css_selector('div.gws-localreviews__google-review:last-of-type')
self.driver.execute_script('arguments[0].scrollIntoView(true);', last_review)
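
A single scroll usually loads only one more batch, so the call above may need to be repeated. Here is a minimal sketch of that loop, not a definitive implementation. The function name load_reviews and the parameters minimum, pause, and max_stalls are my own, hypothetical names; the selector is the one from the snippet above and may change whenever Google changes its markup. It assumes the driver exposes find_elements(by, value) and execute_script(script, *args), as Selenium's WebDriver does, and it passes the locator strategy as the plain string "css selector" (the value behind By.CSS_SELECTOR) so the sketch has no import of its own from selenium.

```python
import time

# Selector taken from the snippet above; Google's markup may change.
REVIEW_SELECTOR = "div.gws-localreviews__google-review"


def load_reviews(driver, minimum, selector=REVIEW_SELECTOR, pause=1.0, max_stalls=3):
    """Scroll the last review into view until `minimum` reviews are present.

    Gives up once the count has not grown for `max_stalls` rounds in a row,
    so the loop cannot run forever when no more reviews exist.
    """
    # "css selector" is the string value behind selenium's By.CSS_SELECTOR.
    reviews = driver.find_elements("css selector", selector)
    stalls = 0
    while reviews and len(reviews) < minimum and stalls < max_stalls:
        # Scrolling the last loaded review into view triggers the next batch.
        driver.execute_script("arguments[0].scrollIntoView(true);", reviews[-1])
        time.sleep(pause)  # crude wait; an explicit wait would be more robust
        refreshed = driver.find_elements("css selector", selector)
        # Count consecutive rounds without growth; reset on any progress.
        stalls = stalls + 1 if len(refreshed) <= len(reviews) else 0
        reviews = refreshed
    return reviews
```

For the question's case this would be called as load_reviews(self.driver, 20) after clicking the reviews link. The returned list can be shorter than the requested minimum if the company has fewer reviews, or longer if a batch overshoots it.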

EDIT:

The following example works correctly for me on Firefox and Chrome. You can reuse the extract_google_reviews function for your needs.

import time

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.wait import WebDriverWait


def extract_google_reviews(driver, query):
    driver.get('https://www.google.com/?hl=en')
    driver.find_element_by_name('q').send_keys(query)
    WebDriverWait(driver, 5).until(EC.element_to_be_clickable((By.NAME, 'btnK'))).click()

    reviews_header = driver.find_element_by_css_selector('div.kp-header')
    reviews_link = reviews_header.find_element_by_partial_link_text('Google reviews')
    number_of_reviews = int(reviews_link.text.split()[0])
    reviews_link.click()

    all_reviews = WebDriverWait(driver, 3).until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'div.gws-localreviews__google-review')))
    while len(all_reviews) < number_of_reviews:
        driver.execute_script('arguments[0].scrollIntoView(true);', all_reviews[-1])
        WebDriverWait(driver, 5, 0.25).until_not(EC.presence_of_element_located((By.CSS_SELECTOR, 'div[class$="activityIndicator"]')))
        all_reviews = driver.find_elements_by_css_selector('div.gws-localreviews__google-review')

    reviews = []
    for review in all_reviews:
        try:
            full_text_element = review.find_element_by_css_selector('span.review-full-text')
        except NoSuchElementException:
            full_text_element = review.find_element_by_css_selector('span[class^="r-"]')
        reviews.append(full_text_element.get_attribute('textContent'))

    return reviews

if __name__ == '__main__':
    # Create the driver before the try block so `driver` is always defined in `finally`.
    driver = webdriver.Firefox()
    try:
        reviews = extract_google_reviews(driver, 'STANLEY BRIDGE CYCLES AND SPORTS LIMITED')
    finally:
        driver.quit()

    print(reviews)
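
One caveat about number_of_reviews: int(reviews_link.text.split()[0]) assumes the link text starts with a plain integer. If the count is shown with a thousands separator, or the link text has leading words, that call raises ValueError. I have not verified which formats Google actually uses, so treat the sketch below as a defensive assumption. parse_review_count is a hypothetical helper name, and it reads "," and "." inside the number as grouping separators.

```python
import re


def parse_review_count(link_text):
    """Return the first integer in text such as '1,234 Google reviews'."""
    # A digit followed by any run of digits, commas, or dots.
    match = re.search(r"\d[\d,.]*", link_text)
    if match is None:
        raise ValueError(f"no review count found in {link_text!r}")
    # Drop the separators and keep only the digits.
    return int(re.sub(r"\D", "", match.group()))
```

It could stand in for the int(...) expression, for example number_of_reviews = parse_review_count(reviews_link.text).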
Answered by Dalvenjia, Oct 21 '22 07:10
