I want to download a CSV file that is generated on a button click through a POST request. I searched the CasperJS and PhantomJS forums as thoroughly as I could and came back empty-handed. In a normal browser such as Firefox, a download dialog appears after the POST request. How can I handle this case in PhantomJS? The response headers are:
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/7.5
Content-disposition: attachment;filename=ExportData.csv
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Fri, 19 Apr 2013 23:26:40 GMT
Content-Length: 65183
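The Content-disposition header above is what makes a desktop browser show its download dialog. PhantomJS has no such dialog, so one option is to watch for that header yourself. Below is a minimal sketch, not a definitive implementation: attachmentFilename is a hypothetical helper name, and it assumes the headers arrive as an array of {name, value} objects, which is the shape PhantomJS passes to page.onResourceReceived.

```javascript
// Hypothetical helper: return the filename announced by a
// Content-Disposition: attachment header, or null if the response
// is not marked as a download.
// Assumption: `headers` is an array of {name, value} objects.
function attachmentFilename(headers) {
    for (var i = 0; i < headers.length; i++) {
        // Header names are case-insensitive ("Content-disposition" above).
        if (headers[i].name.toLowerCase() !== "content-disposition") {
            continue;
        }
        var value = headers[i].value;
        if (!/^\s*attachment/i.test(value)) {
            return null; // e.g. "inline": not a download
        }
        // Accepts filename=ExportData.csv and filename="ExportData.csv".
        // The RFC 5987 form (filename*=...) is not handled in this sketch.
        var match = /filename\s*=\s*"?([^";]+)"?/i.exec(value);
        return match ? match[1].trim() : "";
    }
    return null; // no Content-Disposition header at all
}
```

Called from page.onResourceReceived with response.headers, this only tells you that a download was offered and under which name. As far as I know the response body is not available in that callback, so the file itself still has to be fetched separately.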
I've found a way to do this using CasperJS (it should work with PhantomJS alone if you implement the download function using XMLHttpRequest, but I've not tried it).
Here is a working example that tries to download the most recent PDF from this page. When you click the download link, some JavaScript code is triggered that generates hidden input fields, which are then POSTed.
What we do is replace the form's onsubmit function so that it cancels the submission, and capture the form's destination (action) and all its fields. We use this information later to do the actual download.
var casper = require('casper').create();
casper.start("https://sede.gobcan.es/tributos/jsf/publico/notificaciones/comparecencia/ultimosanuncios.jsp", function() {
    var theFormRequest = this.page.evaluate(function() {
        var request = {};
        var formDom = document.forms["resultadoUltimasNotif"];
        formDom.onsubmit = function() {
            // Iterate over the form fields
            var data = {};
            for (var i = 0; i < formDom.elements.length; i++) {
                data[formDom.elements[i].name] = formDom.elements[i].value;
            }
            request.action = formDom.action;
            request.data = data;
            return false; // Stop form submission
        };
        // Trigger the click on the link.
        var link = $("table.listado tbody tr:first a");
        link.click();
        return request; // Return the form data to casper
    });
    // Start the download
    casper.download(theFormRequest.action, "downloaded_file.pdf", "POST", theFormRequest.data);
});
casper.run();
Note: you have to run it with --ignore-ssl-errors, because the CA that site uses isn't in the default list of trusted CAs.
casperjs --ignore-ssl-errors=true downloadscript.js
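For the PhantomJS-only route mentioned above (doing the download yourself with XMLHttpRequest), here is a rough, untested-against-that-site sketch of what the POST could look like. Both function names are hypothetical. It assumes the captured fields are a plain object like theFormRequest.data and that the server accepts an ordinary URL-encoded form body.

```javascript
// Hypothetical helper: serialize captured form fields into an
// application/x-www-form-urlencoded request body.
function encodeFormData(data) {
    var pairs = [];
    for (var name in data) {
        // Skip inherited properties and unnamed form elements.
        if (Object.prototype.hasOwnProperty.call(data, name) && name !== "") {
            pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(data[name]));
        }
    }
    // Form encoding writes spaces as "+" rather than "%20".
    return pairs.join("&").replace(/%20/g, "+");
}

// Hypothetical sketch of the POST itself. It only works where
// XMLHttpRequest exists, i.e. inside the page context. Synchronous
// for brevity, so the body can simply be returned.
function postForm(action, data) {
    var xhr = new XMLHttpRequest();
    xhr.open("POST", action, false);
    xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded");
    // Keep the bytes untouched so binary payloads (e.g. a PDF) survive
    // as a byte string; a plain CSV may not need this.
    xhr.overrideMimeType("text/plain; charset=x-user-defined");
    xhr.send(encodeFormData(data));
    return xhr.responseText;
}
```

Because the function passed to page.evaluate runs in the page and cannot see variables from the outer script, both helpers would have to be defined inside that callback. The returned string can then be written to disk from the PhantomJS side.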