
Downloaded .pdf files are corrupted when using expressjs

I am working on a MEAN.js application generated with https://github.com/DaftMonk/generator-angular-fullstack. I am trying to generate a .pdf file with PhantomJS and download it to the browser.

The issue is that the downloaded .pdf file always shows blank pages, regardless of the number of pages. The original file on the server is not corrupt. When I investigated further, I found that the downloaded file is always much larger than the original file on disk. The issue happens only with .pdf files; other file types work fine.

I've tried several methods, including res.redirect('http://localhost:9000/assets/exports/receipt.pdf');, res.download('client\\assets\\exports\\receipt.pdf'), and streaming the file myself:

var fileSystem = require('fs');
var stat = fileSystem.statSync('client\\assets\\exports\\receipt.pdf');
res.writeHead(200, {
                    'Content-Type': 'application/pdf',
                    'Content-Length': stat.size
                });

var readStream = fileSystem.createReadStream('client\\assets\\exports\\receipt.pdf');
return readStream.pipe(res);

I've even tried https://github.com/expressjs/serve-static, with no change in the result.

I am new to Node.js. What is the best way to download a .pdf file to the browser?

Update: I am running this on a Windows 8.1 64-bit computer.

Libin TK asked Apr 10 '15 13:04




4 Answers

I had corruption when serving static PDFs too, and I tried everything suggested in the other answers. Then I found this: https://github.com/intesso/connect-livereload/issues/39 In essence, the usually excellent connect-livereload (package ~0.4.0) was corrupting the PDF. The fix is to make it ignore PDFs:

app.use(require('connect-livereload')({ignore: ['.pdf']}));

Now this works:

app.use('/pdf', express.static(path.join(config.root, 'content/files')));

...great relief.
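
One plausible mechanism for the symptoms in the question (a larger file and blank pages) is a middleware that handles the response body as text rather than as raw bytes. This is an assumption for illustration, not something confirmed here about connect-livereload's internals. The sketch below uses only Node's built-in Buffer to show that a round trip through a UTF-8 string changes binary data and makes it longer:

```javascript
// A few bytes like those at the start of a PDF: the ASCII "%PDF" marker
// followed by high-bit bytes, which binary PDF content commonly contains.
var original = Buffer.from([0x25, 0x50, 0x44, 0x46, 0xe2, 0xe3, 0xcf, 0xd3]);

// Simulate code that treats the response body as UTF-8 text.
// Byte sequences that are not valid UTF-8 become replacement characters.
var asText = original.toString('utf8');
var roundTripped = Buffer.from(asText, 'utf8');

console.log('original bytes:     ', original.length);
console.log('after text handling:', roundTripped.length);
console.log('identical:          ', original.equals(roundTripped));
```

If the downloaded file is larger than the one on disk, comparing the first bytes of both files for this kind of substitution is a quick way to check whether something in the middleware chain is rewriting the body.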

Paul Muston answered Oct 15 '22 13:10


Here is a clean way to serve a file from Express. It uses an attachment header to make sure the file is downloaded rather than displayed:

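var fs = require('fs');      // needed for createReadStream below
var path = require('path');
var mime = require('mime');

app.get('/download', function(req, res){
  // Here do whatever you need to get your file; `file` must hold its path
  var filename = path.basename(file);
  // mime 1.x API; mime 2.x and later renamed this to mime.getType(file)
  var mimetype = mime.lookup(file);

  res.setHeader('Content-Disposition', 'attachment; filename=' + filename);
  res.setHeader('Content-Type', mimetype);

  var filestream = fs.createReadStream(file);
  // End the response if the file cannot be read, instead of leaving it hanging
  filestream.on('error', function(){ res.statusCode = 500; res.end(); });
  filestream.pipe(res);
});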
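
The unquoted filename in the header above can break when the name contains spaces. As a minimal sketch, the headers for a PDF download could be built by a small helper. The name downloadHeaders and the example path are hypothetical, and the character replacement is a simplification rather than full header escaping:

```javascript
var path = require('path');

// Hypothetical helper: builds response headers for a PDF download.
// `size` is the file size in bytes, e.g. from fs.statSync(filePath).size.
function downloadHeaders(filePath, size) {
  // Replace characters that would break a quoted header value.
  var filename = path.basename(filePath).replace(/["\\\r\n]/g, '_');
  return {
    'Content-Type': 'application/pdf',
    'Content-Length': size,
    // Quoting the filename keeps names with spaces intact.
    'Content-Disposition': 'attachment; filename="' + filename + '"'
  };
}

var headers = downloadHeaders('/tmp/my receipt.pdf', 1234);
console.log(headers['Content-Disposition']);
```

The returned object can be passed to res.writeHead(200, headers) before piping the read stream to the response.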
Tristan Foureur answered Oct 15 '22 15:10


There are a couple of ways to do this:

  1. If the file is a static one, such as a brochure or a readme, you can tell Express that a folder contains static files (which should be available directly) and keep the file there. This is done with the static middleware: app.use(express.static(pathtofile)); Here is the link: http://expressjs.com/starter/static-files.html

Now you can open the file directly by its URL from the browser:

window.open('http://localhost:9000/assets/exports/receipt.pdf');

or

res.redirect('http://localhost:9000/assets/exports/receipt.pdf'); 

Either of these should work.

  2. The second way is to read the file yourself, in which case the data arrives as a buffer. It should be recognised if you send it directly, but you can also try converting it to base64 encoding using:

    var base64String = buf.toString('base64');

then set the content type:

res.writeHead(200, {
                    'Content-Type': 'application/pdf',
                    'Content-Length': stat.size
                });

and send the data as the response. I will try to put together an example of this.

EDIT: You don't even need to encode it. You may still try that, but I was able to make it work without encoding.

You also do not need to set most headers yourself; Express fills in Content-Length for you. Note that for a Buffer, res.send defaults the Content-Type to application/octet-stream unless you set it. The following snippet is API code that returns the PDF when it is not public/static, in which case you need an API route to serve it:

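router.get('/viz.pdf', function(req, res){
    require('fs').readFile('viz.pdf', function(err, data){
        // Report a read failure instead of sending undefined data
        if (err) { return res.status(500).end(); }
        // res.send(Buffer) defaults to application/octet-stream, so set the type
        res.type('application/pdf');
        res.send(data);
    })
});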

Lastly, note that the URL for the PDF ends in .pdf so that the browser recognises the incoming file as a PDF. Otherwise it will save the file without any extension.

Kop4lyf answered Oct 15 '22 13:10


Usually, if you are using PhantomJS to generate a PDF, the file is written to disk, and you have to supply the path and a callback to the render function.

router.get('/pdf', function(req, res){
    // phantom initialization and generation logic
    // supposing you have the generation code above
    page.render(filePath, function (err) {
        fs.readFile(filePath, function (err, data) {
            // report the failure to the client instead of throwing inside the callback
            if (err) { return res.status(500).end(); }
            // the file was read into a buffer, so it can be deleted to save space
            // (recent Node versions require a callback for fs.unlink)
            fs.unlink(filePath, function () {});

            // send the file contents
            res.setHeader('Content-Type', 'application/pdf');
            res.send(data);
        });
    });
});
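
Before sending a generated file, it can help to confirm that the bytes read from disk actually look like a PDF. That separates a generation problem from corruption introduced during the transfer. The sketch below relies on the fact that PDF files begin with the marker "%PDF-". The helper name looksLikePdf is hypothetical, and the check is only a quick sanity test, not a validation of the whole file:

```javascript
// Hypothetical helper: returns true when the buffer starts with the
// "%PDF-" marker that begins a PDF file.
function looksLikePdf(buffer) {
  if (!Buffer.isBuffer(buffer) || buffer.length < 5) {
    return false;
  }
  // latin1 maps each byte to one character, so no bytes are altered here.
  return buffer.subarray(0, 5).toString('latin1') === '%PDF-';
}

console.log(looksLikePdf(Buffer.from('%PDF-1.4\n')));   // a PDF header
console.log(looksLikePdf(Buffer.from('<html></html>'))); // not a PDF
```

Running the same check on the downloaded file tells you whether the header survived the trip to the browser.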
AndreiC answered Oct 15 '22 14:10