Python Web Scraping: Beautiful Soup

I'm having trouble scraping a web page. I'm trying to get the point differences (e.g. +2, +1, ...) between two teams, but when I call find_all it returns an empty list.

from bs4 import BeautifulSoup
from requests import get

url = 'https://www.mismarcadores.com/partido/Q942gje8/#punto-a-punto;1'
response = get(url)
html_soup = BeautifulSoup(response.text, 'html.parser')

# Returns an empty list for this page
html_soup.find_all('span', class_='match-history-diff-score-inc')
Víctor asked Dec 03 '22 11:12


1 Answer

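The problem is that the content you want is generated dynamically by JavaScript after the page loads. requests only downloads the initial HTML and does not execute JavaScript, so the elements you are searching for never appear in response.text. You'd be better off using something like Selenium, which drives a real browser.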
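One quick way to confirm this kind of diagnosis is to check whether the class name occurs anywhere in the raw HTML that requests receives. The following is only a minimal sketch: the helper name appears_in_static_html and the sample markup are made up for illustration, and the real page's HTML will look different.

```python
def appears_in_static_html(html_text, class_name):
    """Return True if class_name occurs anywhere in the raw HTML text."""
    return class_name in html_text


# Hand-written stand-in (an assumption, not the real page) for a document
# whose rows are filled in later by JavaScript.
sample_static_html = """
<html><body>
  <div id="match-history"></div>
  <script src="app.js"></script>
</body></html>
"""

found = appears_in_static_html(sample_static_html, "match-history-diff-score-inc")
print(found)  # False: the class is not in the static markup
```

With the real page you would pass response.text instead of the sample string. If the result is False, BeautifulSoup cannot find those spans, because they are not in the text it is given.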

EDIT: Per @λuser's suggestion, I've changed my answer to use only Selenium, locating the elements you're looking for by XPath. Note that I used the XPath function starts-with() to match both match-history-diff-score-dec and match-history-diff-score-inc. Selecting only one of them misses almost half of the relative score updates, which is why the output has 103 results instead of 56.

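from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://www.mismarcadores.com/partido/Q942gje8/#punto-a-punto;1")

# find_elements(By.XPATH, ...) is used because newer Selenium 4 releases no longer provide find_elements_by_xpath
spans = driver.find_elements(By.XPATH, '//td//span[starts-with(@class, "match-history-diff-score-")]')

# Collect the text of each matching span (e.g. '+2')
results = []
for tag in spans:
    results.append(tag.get_attribute('innerHTML'))
print(results)

driver.quit()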

This outputs:

['+2', '+1', '+2', '+2', '+1', '+2', '+4', '+2', '+2', '+4', '+7', '+5', '+8', '+5', '+7', '+5', '+3', '+2', '+5', '+3', '+5', '+3', '+5', '+6', '+4', '+6', '+7', '+6', '+5', '+2', '+4', '+2', '+5', '+7', '+6', '+8', '+5', '+3', '+1', '+2', '+1', '+4', '+7', '+5', '+8', '+6', '+9', '+11', '+10', '+9', '+11', '+9', '+10', '+11', '+9', '+7', '+5', '+3', '+2', '+1', '+3', '+1', '+3', '+2', '+1', '+3', '+2', '+4', '+1', '+2', '+3', '+6', '+3', '+5', '+2', '+1', '+1', '+2', '+4', '+3', '+2', '+4', '+1', '+3', '+5', '+7', '+5', '+8', '+7', '+6', '+5', '+4', '+1', '+4', '+6', '+9', '+7', '+9', '+7', '+10', '+11', '+12', '+10']
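The prefix match done by starts-with() in the XPath above can be tried offline without a browser. This is only a rough sketch using the standard library's HTMLParser on a hand-written fragment: the class DiffScoreCollector and the sample markup are assumptions for illustration, and unlike the XPath it does not require the span to sit inside a td.

```python
from html.parser import HTMLParser

PREFIX = "match-history-diff-score-"


class DiffScoreCollector(HTMLParser):
    """Collect the text of <span> tags whose class attribute starts with PREFIX."""

    def __init__(self):
        super().__init__()
        self.results = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        class_value = dict(attrs).get("class") or ""
        # Like XPath starts-with(@class, ...), this tests the whole attribute string
        if tag == "span" and class_value.startswith(PREFIX):
            self._capturing = True

    def handle_data(self, data):
        if self._capturing and data.strip():
            self.results.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "span":
            self._capturing = False


# Hand-written fragment, not copied from the real page
sample_html = """
<table>
  <tr><td><span class="match-history-diff-score-inc">+2</span></td></tr>
  <tr><td><span class="match-history-diff-score-dec">+1</span></td></tr>
  <tr><td><span class="other">ignored</span></td></tr>
</table>
"""

collector = DiffScoreCollector()
collector.feed(sample_html)
collector.close()

# int() accepts a leading '+', so '+2' becomes 2
diffs = [int(text) for text in collector.results]
print(collector.results)
print(diffs)
```

Both the -inc and the -dec span are collected, while the unrelated span is skipped. The same int() conversion could be applied to the results list from Selenium if you need numbers rather than strings.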
Mihai Chelaru answered Dec 25 '22 22:12
