
Why doesn't my HTML parser download the entire HTML document?

I am using Beautiful Soup to scrape the following page: https://www.nyse.com/quote/XNYS:AAN

I want to get the stock value shown below the name and abbreviation. However, when I run my script, soup.find() does not seem to work because the entire HTML file is not being downloaded.

import requests
from bs4 import BeautifulSoup

main_url = "https://www.nyse.com/quote/XNYS:AAN"

# Fetch the page and parse the HTML that the server returns
result = requests.get(main_url)
soup = BeautifulSoup(result.text, 'html.parser')

# Look up the <div> that should hold the stock value
print(soup.find("div", class_="d-dquote-symbol").prettify())

I expect to see the <div> that contains the <span> with the correct stock value. However, the print returns "None" because the script cannot find this tag. I know it exists because I used Inspect Element to find the tag in the first place.

Jay asked Nov 24 '25 08:11


1 Answer

This happens because the page you're scraping is not static.

You can tell from the "spinner" it shows before displaying the values, or by inspecting the Network tab in your browser's developer tools.

requests.get fetches only the initial HTML and does not run the page's JavaScript, so the "follow-up" requests that load the values are never made and you only get the empty page.
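A minimal sketch of how to confirm this from a script: check whether the class you are looking for appears anywhere in the HTML you actually received. The two HTML strings are made-up stand-ins (not the real NYSE markup) for what requests.get receives and what the browser shows after its scripts have run. The sketch uses only the standard library so that it runs without extra packages; with Beautiful Soup, the equivalent check is whether soup.find(...) returns None.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; a class attribute may
        # hold several space-separated class names.
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def has_class(html, class_name):
    """Return True if any tag in `html` carries `class_name`."""
    collector = ClassCollector()
    collector.feed(html)
    return class_name in collector.classes


# Hypothetical stand-ins, NOT the real page content.
initial_html = (
    '<html><body><div id="app" class="spinner"></div>'
    '<script src="app.js"></script></body></html>'
)
rendered_html = (
    '<html><body><div class="d-dquote-symbol">'
    "<span>42.00</span></div></body></html>"
)

print(has_class(initial_html, "d-dquote-symbol"))   # False
print(has_class(rendered_html, "d-dquote-symbol"))  # True
```

Running has_class(result.text, "d-dquote-symbol") on the real response should tell you whether the tag is missing from the download itself rather than being missed by the parser.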

To get the stock value by scraping, use the same request that the site itself makes to load it; it shows up in the Network tab while the page is loading.
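A rough sketch of that approach, under loud assumptions: the endpoint URL and the "last" field name below are hypothetical placeholders, not the real NYSE request. Copy the actual URL, headers, and response layout from the Network tab. The sketch uses the standard library's urllib so that it is self-contained; requests.get(url).json() would do the same job.

```python
import json
from urllib.request import Request, urlopen

# HYPOTHETICAL endpoint: replace with the URL seen in the Network tab.
QUOTE_API_URL = "https://example.com/api/quotes/XNYS:AAN"


def fetch_json(url, timeout=10):
    """Send a GET request to `url` and decode the JSON body."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_last_price(payload):
    """Pull the price out of a decoded payload.

    The "last" key is an ASSUMED field name; adjust it to match the
    JSON the real request returns.
    """
    return float(payload["last"])


# Made-up sample payload, so the parsing step can be tried offline.
sample = json.loads('{"symbol": "AAN", "last": "42.00"}')
print(extract_last_price(sample))  # 42.0
```

Once the real URL and field name are filled in, extract_last_price(fetch_json(QUOTE_API_URL)) would replace the soup.find() lookup entirely, since the response is already structured data.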

NOTE: It is better to look for an official API for this kind of structured data.

Adam.Er8 answered Nov 25 '25 23:11