When I browse website A in a normal browser (Chrome) and click a link on it, Chrome immediately downloads a report as a CSV file.
When I check the server's response headers, I get the following:
Cache-Control:private,max-age=31536000
Connection:Keep-Alive
Content-Disposition:attachment; filename="report.csv"
Content-Encoding:gzip
Content-Language:de-DE
Content-Type:text/csv; charset=UTF-8
Date:Wed, 22 Jul 2015 12:44:30 GMT
Expires:Thu, 21 Jul 2016 12:44:30 GMT
Keep-Alive:timeout=15, max=75
Pragma:cache
Server:Apache
Transfer-Encoding:chunked
Vary:Accept-Encoding
Now I want to download and parse this file using PhantomJS. I set the page's onResourceReceived listener to see whether PhantomJS receives/downloads the file.
clientRequests.phantomPage.onResourceReceived = function(response) {
  console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
};
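Since the listener receives the response headers as a list of name/value pairs, a small helper can pick out the responses that are file downloads. This is only a sketch: getHeader and attachmentFilename are hypothetical names, not part of PhantomJS, and the parsing only covers the simple filename="..." form seen in the headers above (not the filename*= variant). As far as I can tell, the response object passed to onResourceReceived carries metadata only, not the body.

```javascript
// Look up a header value by name (case-insensitive) in a PhantomJS-style
// response object, where response.headers is an array of {name, value}.
function getHeader(response, name) {
  var headers = response.headers || [];
  var wanted = name.toLowerCase();
  for (var i = 0; i < headers.length; i++) {
    if (String(headers[i].name).toLowerCase() === wanted) {
      return headers[i].value;
    }
  }
  return null;
}

// Hypothetical helper: returns the attachment filename, an empty string if
// the response is an attachment without a filename, or null otherwise.
function attachmentFilename(response) {
  var disposition = getHeader(response, 'Content-Disposition');
  if (!disposition || !/^\s*attachment/i.test(disposition)) {
    return null;
  }
  var match = /filename="?([^";]+)"?/i.exec(disposition);
  return match ? match[1] : '';
}
```

Inside the listener, something like `if (attachmentFilename(response) !== null) { ... }` would flag the CSV response while ignoring ordinary page resources.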
When I make the PhantomJS request to download the file (this is page.open('URL OF THE FILE')), I can see in the PhantomJS log that the file is downloaded. Here are the relevant log entries:
"contentType": "text/csv; charset=UTF-8",
"headers": {
"name": "Date",
"value": "Wed, 22 Jul 2015 12:57:41 GMT"
},
"name": "Content-Disposition",
"value": "attachment; filename=\"report.csv\"",
"status":200,"statusText":"OK"
I received the file and its content, but how do I access the file data? When I print the current PhantomJS page object, I get the HTML of page A, which I don't want. I want the CSV file, which I need to parse using JavaScript.
I found a solution for PhantomJS. Reading through this discussion I found a jsfiddle which downloads a URL via jQuery's ajax method and encodes the file as base64.
The file I wanted to download was plain text (CSV), so I removed the encoding functions. My target page also already included jQuery, so I didn't need to inject it.
My code assumes you have already opened the page you want to download the file from using PhantomJS, and that the page has jQuery in it. In my case I first had to log in to the site in order to get the download link.
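var fs = require('fs');
// In my script, this is the already-opened PhantomJS page object.
var page = this;
// Runs inside the target page, so $ is the page's own jQuery and the
// request is sent with the page's session cookies.
var result = page.evaluate(function() {
  var out;
  $.ajax({
    'async' : false, // synchronous, so out is set before we return
    'url' : 'fullurltodownload.csv',
    'success' : function(data, status, xhr) {
      out = data;
    }
  });
  return out;
});
// Write the CSV text to disk ('w' = write mode).
fs.write('mydownloadedfile.csv', result, 'w');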
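Since the original goal was to parse the CSV in JavaScript, the string in result can also be split into rows directly instead of (or in addition to) writing it to disk. Below is a minimal sketch of a parser, not a full CSV implementation: parseCsv is a hypothetical helper name, and the optional delimiter argument is my own addition because some exports use ';' rather than ','. It handles quoted fields, doubled quotes inside quoted fields, and both \n and \r\n line endings; a blank line comes back as a row containing one empty string.

```javascript
// Minimal CSV parser sketch (ES5, so it should also run inside PhantomJS).
// Returns an array of rows, each row being an array of field strings.
function parseCsv(text, delimiter) {
  delimiter = delimiter || ',';
  var rows = [];
  var row = [];
  var field = '';
  var inQuotes = false;

  for (var i = 0; i < text.length; i++) {
    var c = text.charAt(i);
    if (inQuotes) {
      if (c === '"') {
        if (text.charAt(i + 1) === '"') {
          field += '"'; // doubled quote is a literal quote
          i++;
        } else {
          inQuotes = false; // closing quote
        }
      } else {
        field += c;
      }
    } else if (c === '"') {
      inQuotes = true;
    } else if (c === delimiter) {
      row.push(field);
      field = '';
    } else if (c === '\n' || c === '\r') {
      if (c === '\r' && text.charAt(i + 1) === '\n') {
        i++; // treat \r\n as a single line break
      }
      row.push(field);
      rows.push(row);
      row = [];
      field = '';
    } else {
      field += c;
    }
  }

  // Flush the last row when the text does not end with a line break.
  if (field !== '' || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}
```

With the code above, `var rows = parseCsv(result);` would give rows[0] as the header line and the remaining entries as data rows, assuming the report has a header.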