I am using Python to scrape AAPL's stock price from Yahoo Finance, but the program always returns []. I would appreciate it if someone could point out why the program is not working. Here is my code:
import urllib
import re
htmlfile=urllib.urlopen("https://ca.finance.yahoo.com/q?s=AAPL&ql=0")
htmltext=htmlfile.read()
regex='<span id=\"yfs_l84_aapl\" class="">(.+?)</span>'
pattern=re.compile(regex)
price=re.findall(pattern,htmltext)
print price
The original source is like this:
<span id="yfs_l84_aapl" class>112.31</span>
Here I just want the price, 112.31. When I copy and paste this markup, 'class' changes to 'class=""'. I also tried this regex:
regex='<span id=\"yfs_l84_aapl\" class="">(.+?)</span>'
But it does not work either.
Well, the good news is that you are getting the data. You were nearly there. I would recommend that you work out your regex problems in a tool that helps, e.g. regex101.
Anyway, here is your working regex:
regex='<span id="yfs_l84_aapl">(\d*\.\d\d)'
You are collecting only digits, so don't use a general catch-all like (.+?); be specific where you can. This pattern matches zero or more digits, then a literal decimal point, then exactly two more digits.
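If you want to check the pattern without hitting the network, you can run it against a sample string. This is only a sketch, written for Python 3: `sample_html` is a made-up stand-in for the downloaded page, and it assumes the raw HTML carries no class attribute on the span (the markup shown in a browser inspector can differ from what the server actually sends).

```python
import re

# Hypothetical stand-in for the page source; assumes the raw HTML has
# no class attribute on the span.
sample_html = '<div><span id="yfs_l84_aapl">112.31</span></div>'

# Raw strings keep the backslashes intact for the regex engine.
specific = re.compile(r'<span id="yfs_l84_aapl">(\d*\.\d\d)')
original = re.compile(r'<span id="yfs_l84_aapl" class="">(.+?)</span>')

prices = specific.findall(sample_html)
missed = original.findall(sample_html)

print(prices)  # ['112.31']
print(missed)  # [] because the literal text ' class=""' is not in the string
```

The second pattern comes back empty for the same reason the original program did: every literal character in the regex has to appear in the text, so one attribute that is not in the raw HTML is enough to prevent a match.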