How to extract value from span tag

I am writing a simple web scraper to extract the game times for NCAA basketball games. The code doesn't need to be pretty, it just needs to work. I have extracted the values from other span tags on the same page, but for some reason I cannot get this one working.

from bs4 import BeautifulSoup as soup
import requests

url = 'http://www.espn.com/mens-college-basketball/game/_/id/401123420'
response = requests.get(url)
soupy = soup(response.content, 'html.parser')

containers = soupy.findAll("div",{"class" : "team-container"})
for container in containers:
    spans = container.findAll("span")
    divs = container.find("div",{"class": "record"})
    ranks = spans[0].text
    team_name = spans[1].text
    team_mascot = spans[2].text
    team_abbr = spans[3].text
    team_record = divs.text
    time_container = soupy.find("span", {"class":"time game-time"})
    game_times = time_container.text
    refs_container = soupy.find("div", {"class" : "game-info-note__container"})
    refs = refs_container.text
    print(ranks)
    print(team_name)
    print(team_mascot)
    print(team_abbr)
    print(team_record)
    print(game_times)
    print(refs)

The specific code I am concerned about is this:

time_container = soupy.find("span", {"class":"time game-time"})
game_times = time_container.text

I only included the rest of the code to show that .text works on the other span tags. The time is the only data I really want, but with the code as it is I just get an empty string.

This is what I get when I print time_container:

<span class="time game-time" data-dateformat="time1" data-showtimezone="true"></span>

or just '' when I print game_times.

Here is the line of the HTML from the website:

<span class="time game-time" data-dateformat="time1" data-showtimezone="true">6:10 PM CT</span>

I don't understand why the "6:10 PM CT" is gone when I run the script.

asked Apr 09 '19 22:04

zezima

2 Answers

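The site is dynamic: the game time is filled in by JavaScript after the page loads, so the HTML that requests receives contains only the empty span. You need something that runs the page's scripts, such as selenium:

from bs4 import BeautifulSoup as soup
from selenium import webdriver

# Selenium 3 style call; Selenium 4 expects the driver path via a Service object
d = webdriver.Chrome('/path/to/chromedriver')
d.get('http://www.espn.com/mens-college-basketball/game/_/id/401123420')
# page_source is the DOM after the browser has run the page's JavaScript
game_time = soup(d.page_source, 'html.parser').find('span', {'class':'time game-time'}).text
d.quit()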

Output:

'7:10 PM ET'

See the selenium documentation for full details.
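To confirm that the missing text is a rendering issue and not a selector problem, it can help to run the same lookup against the HTML the script received and the HTML the browser shows. The following is a minimal sketch using only the standard library; the `SpanText` class and `span_text` helper are hypothetical names introduced for this illustration, and the two input strings are the snippets quoted in the question. It does not handle nested spans.

```python
from html.parser import HTMLParser


class SpanText(HTMLParser):
    """Collect the text inside the first <span> whose class attribute matches."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.inside = False   # currently between the matching <span> and </span>
        self.found = False    # a matching <span> has already been seen
        self.text = ""

    def handle_starttag(self, tag, attrs):
        if tag == "span" and not self.found and dict(attrs).get("class") == self.class_name:
            self.inside = True
            self.found = True

    def handle_endtag(self, tag):
        if tag == "span":
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.text += data


def span_text(html, class_name):
    """Return the text of the first span with exactly this class attribute."""
    parser = SpanText(class_name)
    parser.feed(html)
    parser.close()
    return parser.text


# What requests received vs. what the browser displayed (both from the question)
raw = '<span class="time game-time" data-dateformat="time1" data-showtimezone="true"></span>'
rendered = '<span class="time game-time" data-dateformat="time1" data-showtimezone="true">6:10 PM CT</span>'

print(repr(span_text(raw, "time game-time")))       # ''
print(repr(span_text(rendered, "time game-time")))  # '6:10 PM CT'
```

The selector matches in both cases; only the content differs, which points to the text being added in the browser after the initial response.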

answered Oct 22 '22 22:10

Ajax1234

An alternative is to call one of ESPN's endpoints directly. These endpoints return JSON responses, for example: https://site.api.espn.com/apis/site/v2/sports/basketball/mens-college-basketball/scoreboard

You can see other endpoints in this GitHub gist: https://gist.github.com/akeaswaran/b48b02f1c94f873c6655e7129910fc3b

This will make your application much more lightweight than running Selenium.

I also recommend opening your browser's developer tools (Inspect) and going to the Network tab. There you can see every request the site makes, including the ones that return the data you are after.
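As a rough sketch of the endpoint approach, the snippet below pulls game names and start times out of a scoreboard response. It assumes the JSON has a top-level "events" list whose items carry "name" and "date" keys; that layout is an assumption to check against a live response, and `game_times`, `fetch_scoreboard`, and the sample data are hypothetical names and values made up for illustration. It uses the standard library rather than requests so that it runs without extra installs.

```python
import json
from urllib.request import urlopen

SCOREBOARD_URL = (
    "https://site.api.espn.com/apis/site/v2/sports/basketball/"
    "mens-college-basketball/scoreboard"
)


def game_times(scoreboard):
    """Return (name, start time) pairs from a parsed scoreboard response.

    Assumption: the response has a top-level "events" list and each event
    has "name" and "date" keys. Missing keys come back as None.
    """
    return [
        (event.get("name"), event.get("date"))
        for event in scoreboard.get("events", [])
    ]


def fetch_scoreboard(url=SCOREBOARD_URL):
    """Download and parse the scoreboard JSON (performs a network request)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


# Made-up sample in the assumed shape, so the parsing can be tried offline
sample = {"events": [{"name": "Team A at Team B", "date": "2019-04-09T23:10Z"}]}
print(game_times(sample))  # [('Team A at Team B', '2019-04-09T23:10Z')]
```

Against the live endpoint the call would be `game_times(fetch_scoreboard())`. If the times come back in UTC, as the sample assumes, they would still need converting to the local time zone shown on the page.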

answered Oct 23 '22 00:10

Jose Ortiz

Tags: python, html, beautifulsoup, web-scraping
