I am trying to write a Python script that sends a POST request with parameters and extracts the result. Using Fiddler, I captured the POST request that returns what I want. The website uses HTTPS only.
    POST /Services/GetFromDataBaseVersionned HTTP/1.1
    Host: www.mywbsite.fr
    "Connection": "keep-alive",
    "Content-Length": 129,
    "Origin": "https://www.mywbsite.fr",
    "X-Requested-With": "XMLHttpRequest",
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.52 Safari/536.5",
    "Content-Type": "application/json",
    "Accept": "*/*",
    "Referer": "https://www.mywbsite.fr/data/mult.aspx",
    "Accept-Encoding": "gzip,deflate,sdch",
    "Accept-Language": "fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4",
    "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
    "Cookie": "ASP.NET_SessionId=j1r1b2a2v2w245; GSFV=FirstVisit=; GSRef=https://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CHgQFjAA&url=https://www.mywbsite.fr/&ei=FZq_T4abNcak0QWZ0vnWCg&usg=AFQjCNHq90dwj5RiEfr1Pw; HelpRotatorCookie=HelpLayerWasSeen=0; NSC_GSPOUGS!TTM=ffffffff09f4f58455e445a4a423660; GS=Site=frfr; __utma=1.219229010.1337956889.1337956889.1337958824.2; __utmb=1.1.10.1337958824; __utmc=1; __utmz=1.1337956889.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided)"

    {"isLeftColumn":false,"lID":-1,"userIpCountryCode":"FR","version":null,"languageCode":"fr","siteCode":"frfr","Quotation":"eu"}
And here is my Python script:
    #!/usr/bin/env python
    # -*- coding: iso-8859-1 -*-
    import string
    import httplib
    import urllib2

    host = "www.mywbsite.fr/sport/multiplex.aspx"
    params='"isLeftColumn":"false","liveID":"-1","userIpCountryCode":"FR","version":"null","languageCode":"fr","siteCode":"frfr","Quotation":"eu"'
    headers = {
        Host: www.mywbsite.fr,
        "Connection": "keep-alive",
        "Content-Length": 129,
        "Origin": "https://www.mywbsite.fr",
        "X-Requested-With": "XMLHttpRequest",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.52 Safari/536.5",
        "Content-Type": "application/json",
        "Accept": "*/*",
        "Referer": "https://www.mywbsite.fr/data/mult.aspx",
        "Accept-Encoding": "gzip,deflate,sdch",
        "Accept-Language": "fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4",
        "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
        "Cookie": "ASP.NET_SessionId=j1r1b2a2v2w245; GSFV=FirstVisit=; GSRef=https://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CHgQFjAA&url=https://www.mywbsite.fr/&ei=FZq_T4abNcak0QWZ0vnWCg&usg=AFQjCNHq90dwj5RiEfr1Pw; HelpRotatorCookie=HelpLayerWasSeen=0; NSC_GSPOUGS!TTM=ffffffff09f4f58455e445a4a423660; GS=Site=frfr; __utma=1.219229010.1337956889.1337956889.1337958824.2; __utmb=1.1.10.1337958824; __utmc=1; __utmz=1.1337956889.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided)"
    }
    url = "/Services/GetFromDataBaseVersionned"

    # POST the request
    conn = httplib.HTTPConnection(host,port=443)
    conn.request("POST",url,params,headers)
    response = conn.getresponse()
    data = response.read()
    print data
But when I run my script, I get this error:
socket.gaierror: [Errno -2] Name or service not known
HTTP headers let the client and the server pass additional information with an HTTP request or response. Header field names are case-insensitive. Each header is a name-value pair separated by a colon and sent as plain text.
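To illustrate the name-value format and the case-insensitive names, here is a minimal sketch that parses a few of the headers from the captured request. It uses the standard library's `email.parser`, which reads the same `Name: value` layout; the variable names are only illustrative.

```python
from email.parser import Parser

# A few header lines in "Name: value" form, ended by a blank line.
raw_headers = (
    "Content-Type: application/json\r\n"
    "X-Requested-With: XMLHttpRequest\r\n"
    "Accept: */*\r\n"
    "\r\n"
)

# headersonly=True stops parsing at the end of the header section.
headers = Parser().parsestr(raw_headers, headersonly=True)

# Lookups ignore the case of the header name.
content_type = headers["content-type"]
requested_with = headers["X-REQUESTED-WITH"]

print(content_type)
print(requested_with)
```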
The request and the response exchanged between client and server each consist of headers and a body. Headers carry protocol-specific information and appear at the beginning of the raw message sent over the TCP connection. The body of the message is separated from the headers by a blank line.
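As a rough sketch of that layout, the snippet below assembles a raw request by hand and then splits it again at the blank line. The shortened JSON body and the choice of headers are assumptions made for illustration; `Content-Length` is computed from the body rather than copied from the capture.

```python
import json

# Shortened, illustrative JSON body (not the full captured payload).
body = json.dumps({"languageCode": "fr", "siteCode": "frfr"}).encode("utf-8")

# Request line first, then one "Name: value" line per header.
head_lines = [
    "POST /Services/GetFromDataBaseVersionned HTTP/1.1",
    "Host: www.mywbsite.fr",
    "Content-Type: application/json",
    "Content-Length: %d" % len(body),
]

# An empty line (CRLF CRLF) separates the headers from the body.
raw_message = "\r\n".join(head_lines).encode("ascii") + b"\r\n\r\n" + body

# Splitting at the first blank line recovers both parts.
head, _, parsed_body = raw_message.partition(b"\r\n\r\n")
request_line = head.split(b"\r\n")[0]

print(request_line.decode("ascii"))
print(parsed_body.decode("utf-8"))
```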
Thanks a lot for the link to the requests module; it's just perfect. Below is the solution to my problem.
    import requests
    import json

    url = 'https://www.mywbsite.fr/Services/GetFromDataBaseVersionned'
    payload = {
        "Host": "www.mywbsite.fr",
        "Connection": "keep-alive",
        "Content-Length": 129,
        "Origin": "https://www.mywbsite.fr",
        "X-Requested-With": "XMLHttpRequest",
        "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/536.5 (KHTML, like Gecko) Chrome/19.0.1084.52 Safari/536.5",
        "Content-Type": "application/json",
        "Accept": "*/*",
        "Referer": "https://www.mywbsite.fr/data/mult.aspx",
        "Accept-Encoding": "gzip,deflate,sdch",
        "Accept-Language": "fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4",
        "Accept-Charset": "ISO-8859-1,utf-8;q=0.7,*;q=0.3",
        "Cookie": "ASP.NET_SessionId=j1r1b2a2v2w245; GSFV=FirstVisit=; GSRef=https://www.google.fr/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=0CHgQFjAA&url=https://www.mywbsite.fr/&ei=FZq_T4abNcak0QWZ0vnWCg&usg=AFQjCNHq90dwj5RiEfr1Pw; HelpRotatorCookie=HelpLayerWasSeen=0; NSC_GSPOUGS!TTM=ffffffff09f4f58455e445a4a423660; GS=Site=frfr; __utma=1.219229010.1337956889.1337956889.1337958824.2; __utmb=1.1.10.1337958824; __utmc=1; __utmz=1.1337956889.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided)"
    }
    # Adding empty header as parameters are being sent in payload
    headers = {}
    r = requests.post(url, data=json.dumps(payload), headers=headers)
    print(r.content)
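Note that the script above sends the captured header values as the JSON body. If the service actually expects the body seen in Fiddler, a variant along the following lines may be closer to the original request. This is only a sketch under that assumption: `build_post_kwargs` and `send` are hypothetical helper names, only a subset of the captured headers is kept, and the session cookie is left out although the server may require it. `Host` and `Content-Length` are omitted because requests fills them in itself. The `socket.gaierror` from the first script most likely came from passing a path inside `host`, which cannot be resolved as a hostname; a full `https://` URL avoids that.

```python
import json

URL = "https://www.mywbsite.fr/Services/GetFromDataBaseVersionned"


def build_post_kwargs():
    """Return keyword arguments for requests.post; nothing is sent here."""
    # JSON body as captured in Fiddler. Python False/None serialize
    # to the JSON literals false/null.
    body = {
        "isLeftColumn": False,
        "lID": -1,
        "userIpCountryCode": "FR",
        "version": None,
        "languageCode": "fr",
        "siteCode": "frfr",
        "Quotation": "eu",
    }
    # Subset of the captured headers (an assumption about what matters).
    headers = {
        "Content-Type": "application/json",
        "X-Requested-With": "XMLHttpRequest",
        "Origin": "https://www.mywbsite.fr",
        "Referer": "https://www.mywbsite.fr/data/mult.aspx",
    }
    return {"url": URL, "data": json.dumps(body), "headers": headers}


def send():
    """Send the request; needs network access and the requests package."""
    import requests  # third-party, imported lazily

    response = requests.post(**build_post_kwargs())
    response.raise_for_status()
    return response.content
```

Calling `send()` performs the actual POST and returns the raw response bytes, which can then be printed or decoded as in the script above.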