I am scraping some HTML source from a web page to extract data stored in JSON format.
This is the code:
import json
import requests
from bs4 import BeautifulSoup

url = 'https://finance.yahoo.com/quote/SPY'
result = requests.get(url)
c = result.content
html = BeautifulSoup(c, 'html.parser')
scripts = html.find_all('script')
sl = []
for s in scripts:
    sl.append(s)
s = sl[-3]
s = s.contents
s = str(s)
s = s[119:-16]
json_data = json.loads(s)
Running the above throws this error:
json.decoder.JSONDecodeError: Expecting ',' delimiter: line 1 column 7506 (char 7505)
When I take the content of variable s and pass it to a JSON formatter, it is recognized as valid JSON.
I used the following web site to check the json: http://jsonprettyprint.com/json-pretty-printer.php
Why is this error coming up when using json.loads() in Python? I am assuming it has something to do with the string not being encoded properly or the presence of escape characters?
How do I solve this?
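Your JSON contains tokens like true, which are valid in JSON but not in Python. Calling json.dumps first makes the error go away, but note that json.dumps(s) encodes s as a JSON string literal, so json.loads then returns the same string, not a parsed dict.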
print(json.dumps(s, indent=2))
s = json.dumps(s)
json_data = json.loads(s)
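For what it's worth, a likely cause of the original error is that s.contents is a list, so str(s) produces Python's repr of that list rather than the raw script text. In a repr every backslash is doubled, which turns an escaped quote inside a JSON string into an early end of that string, and the parser then complains about a missing ',' delimiter. The sketch below reproduces this without any network access and parses the raw text instead. The script_text value and the parse_embedded_json helper are made up for illustration and are not taken from the real page; json.JSONDecoder.raw_decode is the standard-library call that parses one JSON value and ignores whatever follows it.

```python
import json

# Hypothetical stand-in for the text inside the target <script> tag.
script_text = r'root.App.main = {"symbol": "SPY", "name": "SPDR \"S&P 500\" ETF", "open": true};'

# Tag.contents is a list, so str() on it yields a Python repr with doubled backslashes.
as_repr = str([script_text])

def parse_embedded_json(text):
    """Parse the first JSON object found in text, ignoring anything after it."""
    start = text.index("{")
    obj, _end = json.JSONDecoder().raw_decode(text, start)
    return obj

# Slicing the repr, as in the question, leaves the doubled backslashes in place.
try:
    json.loads(as_repr[as_repr.index("{"):-3])
    repr_error = None
except json.JSONDecodeError as exc:
    repr_error = exc.msg

# Parsing the raw text works and gives a real dict.
data = parse_embedded_json(script_text)

# dumps followed by loads only round-trips the string; nothing gets parsed.
round_trip = json.loads(json.dumps(script_text))

print(repr_error)
print(data["name"], data["open"])
print(type(round_trip).__name__)
```

With BeautifulSoup, the raw text would typically come from the tag's string content rather than from str(s.contents); searching for the opening brace also avoids hard-coded offsets like 119 and -16, which break as soon as the page changes.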