Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Scraping a table on Barchart website using python

Tags:

python

Scraping an AJAX web page using python and requests

I used the script in above link to get a table on Barchart webite and it somehow stopped working recently with the error message {'error': {'message': 'The payload is invalid.', 'code': 400}}. I guess some of the filed names have been changed but I am pretty new to web scanning and I couldn't figure out how to fix it. Any suggestions?

import requests

geturl=r'https://www.barchart.com/futures/quotes/CLJ19/all-futures'
apiurl=r'https://www.barchart.com/proxies/core-api/v1/quotes/get'


getheaders={

    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'cache-control': 'max-age=0',
    'upgrade-insecure-requests': '1',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36'
    }

getpay={
    'page': 'all'
}

s=requests.Session()
r=s.get(geturl,params=getpay, headers=getheaders)



headers={
    'accept': 'application/json',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-US,en;q=0.9',
    'referer': 'https://www.barchart.com/futures/quotes/CLJ19/all-futures?page=all',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.119 Safari/537.36',
    'x-xsrf-token': s.cookies.get_dict()['XSRF-TOKEN']

}
payload={
    'fields': 'symbol,contractSymbol,lastPrice,priceChange,openPrice,highPrice,lowPrice,previousPrice,volume,openInterest,tradeTime,symbolCode,symbolType,hasOptions',
    'list': 'futures.contractInRoot',
    'root': 'CL',
    'meta': 'field.shortName,field.type,field.description',
    'hasOptions': 'true',
    'raw': '1'

}


r=s.get(apiurl,params=payload,headers=headers)
j=r.json()
print(j)

OUT: {'error': {'message': 'The payload is invalid.', 'code': 400}}

like image 548
Tony Tang Avatar asked Oct 14 '25 04:10

Tony Tang


1 Answers

This happened with me too. This is because the website gets the table from an internal API and the cookies should be decoded to avoid this error.

Try this solution:

1- import the unquote function at the beginning of your code

from urllib.parse import unquote

2- Change this line:

'x-xsrf-token': s.cookies.get_dict()['XSRF-TOKEN']

to this:

'x-xsrf-token': unquote(unquote(s.cookies.get_dict()['XSRF-TOKEN']))
like image 196
Ahmed Sabry Avatar answered Oct 16 '25 18:10

Ahmed Sabry