I've looked around the internet for a solution to this, but none seemed applicable here. I'm writing a Python program to predict the next day's stock price from historical data. I don't need all the historical data since inception, which is what Yahoo Finance provides, only the last 60 days or so. The NASDAQ website provides just the right amount of historical data, so I want to use it.
What I want to do is go to a particular stock's profile on NASDAQ (for example, www.nasdaq.com/symbol/amd/historical) and click the "Download this File in Excel Format" link at the very bottom. I inspected the page's HTML to see if there was an actual link I could just use with urllib to get the file, but all I got was:
<a id="lnkDownLoad" href="javascript:getQuotes(true);">
Download this file in Excel Format
</a>
No link. So my question is: how can I write a Python script that goes to a given stock's NASDAQ page, clicks the "Download this file in Excel Format" link, and actually downloads the file? Most solutions online require you to know the URL where the file is stored, but in this case I don't have access to that. So how do I go about doing this?
Get all links from a webpage:
1. Download the webpage data (HTML).
2. Create a BeautifulSoup object and parse the webpage data.
3. Use the soup's findAll method to find all links by the a tag.
4. Store all links in a list.
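The link-collecting steps could be sketched roughly as below. This is a minimal illustration, not a tested scraper for the live site: the function name extract_links and the inline sample HTML (including the /symbol/amd link) are made up for the example, and the built-in html.parser backend is used so no extra parser is needed. It also shows why this approach alone does not solve the question: the only href it finds for the download link is the javascript: call.

```python
from bs4 import BeautifulSoup


def extract_links(html):
    """Return the href of every <a> tag in the given HTML text."""
    soup = BeautifulSoup(html, "html.parser")
    # find_all is the current spelling of findAll;
    # href=True skips anchors that have no href attribute.
    return [a["href"] for a in soup.find_all("a", href=True)]


# Hypothetical inline sample so the sketch runs without a network request.
SAMPLE_HTML = """
<a id="lnkDownLoad" href="javascript:getQuotes(true);">
Download this file in Excel Format
</a>
<a href="/symbol/amd">AMD</a>
<a name="top">anchor without an href</a>
"""

links = extract_links(SAMPLE_HTML)
print(links)
```

For a real page, the HTML string would come from something like requests.get(url).text, as in the script further down.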
View > Developer > Developer Tools > Network tab
Please be aware that this may be against the website's Terms of Service!
It appears that BeautifulSoup might be the easiest way to do this. I've made a cursory check that the results of the following script are the same as those that appear on the page. You would just have to write the results to a file, rather than print them. However, the columns are ordered differently.
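import requests
from bs4 import BeautifulSoup

URL = 'http://www.nasdaq.com/symbol/amd/historical'
page = requests.get(URL).text
soup = BeautifulSoup(page, 'lxml')

# The historical quotes table sits inside the div with id "historicalContainer"
tableDiv = soup.find_all('div', id="historicalContainer")
tableRows = tableDiv[0].find_all('tr')

# Skip the first two rows and print the remaining rows as CSV lines
for tableRow in tableRows[2:]:
    row = tuple(tableRow.getText().split())
    # Quote the date and the comma-separated volume so each stays one field
    print('"%s",%s,%s,%s,%s,"%s"' % row)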
Output:
"03/24/2017",14.16,14.18,13.54,13.7,"50,022,400"
"03/23/2017",13.96,14.115,13.77,13.79,"44,402,540"
"03/22/2017",13.7,14.145,13.55,14.1,"61,120,500"
"03/21/2017",14.4,14.49,13.78,13.82,"72,373,080"
"03/20/2017",13.68,14.5,13.54,14.4,"91,009,110"
"03/17/2017",13.62,13.74,13.36,13.49,"224,761,700"
"03/16/2017",13.79,13.88,13.65,13.65,"44,356,700"
"03/15/2017",14.03,14.06,13.62,13.98,"55,070,770"
"03/14/2017",14,14.15,13.6401,14.1,"52,355,490"
"03/13/2017",14.475,14.68,14.18,14.28,"72,917,550"
"03/10/2017",13.5,13.93,13.45,13.91,"62,426,240"
"03/09/2017",13.45,13.45,13.11,13.33,"45,122,590"
"03/08/2017",13.25,13.55,13.1,13.22,"71,231,410"
"03/07/2017",13.07,13.37,12.79,13.05,"76,518,390"
"03/06/2017",13,13.34,12.38,13.04,"117,044,000"
"03/03/2017",13.55,13.58,12.79,13.03,"163,489,100"
"03/02/2017",14.59,14.78,13.87,13.9,"103,970,100"
"03/01/2017",15.08,15.09,14.52,14.96,"73,311,380"
"02/28/2017",15.45,15.55,14.35,14.46,"141,638,700"
"02/27/2017",14.27,15.35,14.27,15.2,"95,126,330"
"02/24/2017",14,14.32,13.86,14.12,"46,130,900"
"02/23/2017",14.2,14.45,13.82,14.32,"79,900,450"
"02/22/2017",14.3,14.5,14.04,14.28,"71,394,390"
"02/21/2017",13.41,14.1,13.4,14,"66,250,920"
"02/17/2017",12.79,13.14,12.6,13.13,"40,831,730"
"02/16/2017",13.25,13.35,12.84,12.97,"52,403,840"
"02/15/2017",13.2,13.44,13.15,13.3,"33,655,580"
"02/14/2017",13.43,13.49,13.19,13.26,"40,436,710"
"02/13/2017",13.7,13.95,13.38,13.49,"57,231,080"
"02/10/2017",13.86,13.86,13.25,13.58,"54,522,240"
"02/09/2017",13.78,13.89,13.4,13.42,"72,826,820"
"02/08/2017",13.21,13.75,13.08,13.56,"75,894,880"
"02/07/2017",14.05,14.27,13.06,13.29,"158,507,200"
"02/06/2017",12.46,13.7,12.38,13.63,"139,921,700"
"02/03/2017",12.37,12.5,12.04,12.24,"59,981,710"
"02/02/2017",11.98,12.66,11.95,12.28,"116,246,800"
"02/01/2017",10.9,12.14,10.81,12.06,"165,784,500"
"01/31/2017",10.6,10.67,10.22,10.37,"51,993,490"
"01/30/2017",10.62,10.68,10.3,10.61,"37,648,430"
"01/27/2017",10.6,10.73,10.52,10.67,"32,563,480"
"01/26/2017",10.35,10.66,10.3,10.52,"35,779,140"
"01/25/2017",10.74,10.975,10.15,10.35,"61,800,440"
"01/24/2017",9.95,10.49,9.95,10.44,"43,858,900"
"01/23/2017",9.68,10.06,9.68,9.91,"27,848,180"
"01/20/2017",9.88,9.96,9.67,9.75,"27,936,610"
"01/19/2017",9.92,10.25,9.75,9.77,"46,087,250"
"01/18/2017",9.54,10.1,9.42,9.88,"51,705,580"
"01/17/2017",10.17,10.23,9.78,9.82,"70,388,000"
"01/13/2017",10.79,10.87,10.56,10.58,"38,344,340"
"01/12/2017",10.98,11.0376,10.33,10.76,"75,178,900"
"01/11/2017",11.39,11.41,11.15,11.2,"39,337,330"
"01/10/2017",11.55,11.63,11.33,11.44,"29,122,540"
"01/09/2017",11.37,11.64,11.31,11.49,"37,215,840"
"01/06/2017",11.29,11.49,11.11,11.32,"34,437,560"
"01/05/2017",11.43,11.69,11.23,11.24,"38,777,380"
"01/04/2017",11.45,11.5204,11.235,11.43,"40,742,680"
"01/03/2017",11.42,11.65,11.02,11.43,"55,114,820"
"12/30/2016",11.7,11.78,11.25,11.34,"44,033,460"
"12/29/2016",11.24,11.62,11.01,11.59,"50,180,310"
"12/28/2016",12.28,12.42,11.46,11.55,"71,072,640"
"12/27/2016",11.65,12.08,11.6,12.07,"44,168,130"
The script wraps the dates and the thousands-separated volume figures in double quotes, so the commas inside the volumes are not read as field separators.
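To write the rows to a file rather than print them, one option is the standard library's csv module, which adds quotes only where a field needs them (here, the volumes that contain commas). The sketch below is an illustration under assumptions: write_rows is a hypothetical helper, and the two sample rows are copied from the output above so it runs without fetching the page.

```python
import csv
import io


def write_rows(rows, out):
    """Write an iterable of row tuples to the file-like object `out` as CSV."""
    # The default quoting (QUOTE_MINIMAL) quotes only fields that contain
    # a comma, a quote character, or a line break.
    writer = csv.writer(out)
    writer.writerows(rows)


# Two rows taken from the output above, split the same way the script does.
sample_rows = [
    ("03/24/2017", "14.16", "14.18", "13.54", "13.7", "50,022,400"),
    ("03/23/2017", "13.96", "14.115", "13.77", "13.79", "44,402,540"),
]

# An in-memory buffer stands in for a real file in this sketch.
buffer = io.StringIO()
write_rows(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text, end="")
```

With a real file, the csv documentation recommends opening it with newline="", for example open("amd.csv", "w", newline=""), where amd.csv is just a placeholder name. Unlike the print-based script, this leaves the dates unquoted, which is still valid CSV.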