
How to download and save a PDF file, which is received as attachment in response header in PhantomJS?

I'm trying to download a PDF file using PhantomJS. There is no direct URL for the PDF; clicking the submit button calls an internal JavaScript function that triggers the download.

Here is the code that I am using to download the PDF file:

    page.open(url, function(status){
        page.evaluate(function(){
            document.getElementById('id').click();
        });
    });

    page.onResourceReceived = function(request){
        console.log('Received ' + JSON.stringify(request, undefined, 4));
    };

Here 'id' is the element id of the submit button. The problem is that, although I get the response as JSON inside the onResourceReceived callback, I'm not able to save the attachment as a PDF file.

When I run the above code, I get the following output as a JSON string:

    Received {
        "contentType": "application/pdf",
        "headers": [
            // Some other headers.
            {
                "name": "Content-Type",
                "value": "application/pdf"
            },
            {
                "name": "content-disposition",
                "value": "attachment; filename=FILENAME.PDF"
            },
        ],
        "id": 50,
        "redirectURL": null,
        "stage": "end",
        "status": 200,
        "statusText": "OK",
        "url": "http://www.someurl.com"
    }
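As far as I know, the object passed to onResourceReceived carries only metadata (status, headers, URL) and not the response body, which would explain why there is nothing to write to disk at this point. What the callback can do is detect the attachment and recover its filename and URL. A minimal sketch of that detection step follows; `attachmentFilename` is a hypothetical helper name, not part of the PhantomJS API, and it only handles the plain `filename=` form seen in the output above:

```javascript
// Hypothetical helper: returns the attachment filename from a
// PhantomJS-style response object, '' if the attachment has no
// filename, or null if the response is not an attachment.
function attachmentFilename(response) {
    var headers = response.headers || [];
    for (var i = 0; i < headers.length; i++) {
        // Header names are case-insensitive, so normalise before comparing.
        if (headers[i].name.toLowerCase() !== 'content-disposition') {
            continue;
        }
        var value = headers[i].value;
        if (!/^\s*attachment/i.test(value)) {
            return null;
        }
        // Accepts both filename=NAME and filename="NAME".
        var match = /filename\s*=\s*"?([^";]+)"?/i.exec(value);
        return match ? match[1].trim() : '';
    }
    return null;
}
```

Inside onResourceReceived this could be called when the stage is "end"; a non-null result means the URL of that response is the one that served the PDF.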

Please suggest solutions that use PhantomJS only. Thank you!

Ishank Jain asked Jul 22 '15 14:07


1 Answer

In general, I would recommend moving away from PhantomJS and having a look at Headless Chrome. Here is a nice article about this topic. I used Puppeteer (https://github.com/puppeteer/puppeteer) for this purpose, and it was easy to integrate.
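For illustration, a rough sketch of what the Puppeteer route might look like. Everything here is an assumption on my side rather than code from the question: `downloadViaClick` is a made-up name, the Puppeteer module is passed in as an argument, `downloadDir` should be an absolute path, and the fixed wait is a crude stand-in for real download-completion detection. It relies on the DevTools protocol command `Page.setDownloadBehavior`, which is marked experimental and may behave differently across Chrome versions.

```javascript
// Sketch only: clicks an element and lets headless Chrome save the
// resulting attachment into downloadDir.
// `puppeteer` is the module object, e.g. require('puppeteer').
async function downloadViaClick(puppeteer, url, selector, downloadDir, waitMs) {
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        // Ask Chrome, via a DevTools protocol session, to save
        // attachments to downloadDir instead of discarding them.
        const client = await page.target().createCDPSession();
        await client.send('Page.setDownloadBehavior', {
            behavior: 'allow',
            downloadPath: downloadDir
        });
        await page.goto(url);
        await page.click(selector);
        // Assumption: the download finishes within waitMs milliseconds.
        await new Promise(function (resolve) {
            setTimeout(resolve, waitMs);
        });
    } finally {
        // Always close the browser, even if a step above throws.
        await browser.close();
    }
}
```

With the page from the question, a call could look like `downloadViaClick(require('puppeteer'), url, '#id', '/absolute/path/to/downloads', 5000)`, where '#id' selects the submit button.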

Jakub Kubista answered Dec 17 '22 11:12