Mozilla PDF - how to view PDFs from url in react app?

I followed a quick tutorial on implementing Mozilla's PDF viewer with React, and I have made a CodeSandbox here. I would like to know whether the same thing can be done by importing pdfjs as a node module, instead of downloading the package into the public folder and using it like this:

export default class PDFJs {
  init = (source, element) => {
    const iframe = document.createElement("iframe");

    iframe.src = `/pdfjs-2.5.207-dist/web/viewer.html?file=${source}`;
    iframe.width = "100%";
    iframe.height = "100%";

    element.appendChild(iframe);
  };
}

Also, this kind of setup doesn't work when the PDF's source is a URL. If I do that, I get an error:

PDF.js v2.5.207 (build: 0974d6052) Message: file origin does not match viewer's

I have commented out the part of the code where it checks the file's origin in pdfjs-2.5.207-dist/web/viewer.js:

  //if (origin !== viewerOrigin && protocol !== "blob:") {
  //  throw new Error("file origin does not match viewer's");
  //} 

But then I got another error:

PDF.js v2.5.207 (build: 0974d6052) Message: Failed to fetch

How can I fix this? Is it possible to import this package as a module into a React component, and how can I use it for PDFs from external sources given by URL?

Ludwig asked Jan 13 '17 09:01


2 Answers

Here is a working codesandbox with Mozilla's viewer and your pdf.

Things to note:

  1. Your pdf must be served over HTTPS, otherwise you get this error:

Mixed Content: The page at 'https://codesandbox.io/' was loaded over HTTPS, but requested an insecure resource 'http://www.africau.edu/images/default/sample.pdf'. This request has been blocked; the content must be served over HTTPS.

  2. The server hosting the pdf should allow your app domain using Access-Control-Allow-Origin, or be in the same origin, otherwise you get this error:

Access to fetch at 'https://www.adobe.com/support/products/enterprise/knowledgecenter/media/c4611_sample_explain.pdf' from origin 'https://lchyv.csb.app' has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.

  3. For demo purposes, I used https://cors-anywhere.herokuapp.com/<URL_TO_PDF>, which sets Access-Control-Allow-Origin: * for you, but should not be used in production!

So in conclusion, your pdf didn't load because of the browser's restrictions. Importing pdfjs directly in your app, and building a viewer from scratch (which is a lot of work), won't solve those problems.
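The two restrictions above can be checked up front. The following is only a rough sketch: `diagnosePdfUrl` is a hypothetical helper (not part of pdfjs), and it ignores special cases such as localhost, which browsers may treat differently for mixed content.

```javascript
// Sketch: predict which browser restriction would block fetching a PDF.
// `pageUrl` is the URL of the page running the viewer, `pdfUrl` the PDF to load.
// Hypothetical helper, not part of pdfjs.
function diagnosePdfUrl(pageUrl, pdfUrl) {
  const page = new URL(pageUrl);
  const pdf = new URL(pdfUrl, pageUrl); // relative PDF paths resolve against the page
  const issues = [];

  // An HTTPS page may not fetch plain-HTTP resources (mixed content).
  if (page.protocol === "https:" && pdf.protocol === "http:") {
    issues.push("mixed-content");
  }

  // A different origin (scheme + host + port) means the PDF server must
  // send a matching Access-Control-Allow-Origin header.
  if (page.origin !== pdf.origin) {
    issues.push("needs-cors-header");
  }

  return issues;
}
```

An empty result means neither of the two restrictions should apply; it does not guarantee that the file exists or is a valid PDF.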

Mohamed Ramrami answered Oct 20 '22 01:10


Referrer Policy: strict-origin-when-cross-origin / Usage with external sources

The pdf should be served from the same origin as your app (same host, including the same protocol). Hosting the pdf alongside your app/website should solve this problem.
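With the pdf on the same origin, the iframe URL from the question can be built roughly as below. This is a sketch, assuming the prebuilt viewer still lives under the path used in the question; `buildViewerUrl` is a hypothetical helper. The point it illustrates is that the `file` value should be percent-encoded, otherwise a `?` or `&` inside the pdf path gets mixed up with the viewer's own query string.

```javascript
// Sketch: build the src for the viewer iframe.
// `viewerPath` is where viewer.html is hosted (assumed, taken from the question);
// `pdfPath` is the path or URL of the pdf to display.
function buildViewerUrl(viewerPath, pdfPath) {
  // encodeURIComponent keeps characters such as "?", "&" and "=" inside
  // the pdf path from being parsed as viewer query parameters.
  return `${viewerPath}?file=${encodeURIComponent(pdfPath)}`;
}
```

For example, `iframe.src = buildViewerUrl("/pdfjs-2.5.207-dist/web/viewer.html", source);` would replace the template string in the question's `init` method.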

Browsers block cross-origin reads by default, because allowing a pdf to be loaded in other pages can lead to various security risks.

If you want to show an up-to-date version of an external pdf on your own homepage, there are basically two options.

Hosting PDF on your server

Run a scheduled server script (e.g. via cron) that downloads the pdf and serves it from your own server.

Allow cross-origin

If you have access to the server hosting the pdf, you can send a header that allows cross-origin requests. The wildcard value allows any origin:

Access-Control-Allow-Origin: *
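If you would rather not open the pdf to every site, the server can echo back only origins it trusts. A minimal sketch, assuming a Node server; `corsHeadersFor` and the allow list are hypothetical names used for illustration:

```javascript
// Sketch: compute CORS response headers for a request.
// `requestOrigin` is the value of the request's Origin header (may be undefined);
// `allowedOrigins` is a list of origins you trust (an assumption of this example).
function corsHeadersFor(requestOrigin, allowedOrigins) {
  if (!requestOrigin || !allowedOrigins.includes(requestOrigin)) {
    return {}; // no CORS header: the browser will block cross-origin reads
  }
  return {
    "Access-Control-Allow-Origin": requestOrigin,
    // Tell caches that the response differs per Origin header.
    "Vary": "Origin",
  };
}
```

In a Node `http` handler this could be spread into the response headers, e.g. `res.writeHead(200, { "Content-Type": "application/pdf", ...corsHeadersFor(req.headers.origin, allowed) })`.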

How to use pdfjs with yarn/npm

The documentation on this is really bad, but there is a pdfjs-dist repository and some related docs.

Installation

npm install pdfjs-dist

Usage (from DOC)

import * as pdfjsLib from 'pdfjs-dist';
var url = 'https://raw.githubusercontent.com/mozilla/pdf.js/ba2edeae/examples/learning/helloworld.pdf';

// The workerSrc property shall be specified.
pdfjsLib.GlobalWorkerOptions.workerSrc = '//mozilla.github.io/pdf.js/build/pdf.worker.js';

// Asynchronous download of PDF
var loadingTask = pdfjsLib.getDocument(url);
loadingTask.promise.then(function(pdf) {
  console.log('PDF loaded');
  
  // Fetch the first page
  var pageNumber = 1;
  pdf.getPage(pageNumber).then(function(page) {
    console.log('Page loaded');
    
    var scale = 1.5;
    var viewport = page.getViewport({scale: scale});

    // Prepare canvas using PDF page dimensions
    var canvas = document.getElementById('the-canvas');
    var context = canvas.getContext('2d');
    canvas.height = viewport.height;
    canvas.width = viewport.width;

    // Render PDF page into canvas context
    var renderContext = {
      canvasContext: context,
      viewport: viewport
    };
    var renderTask = page.render(renderContext);
    renderTask.promise.then(function () {
      console.log('Page rendered');
    });
  });
}, function (reason) {
  // PDF loading error
  console.error(reason);
});
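One thing to watch in the example above: the hardcoded workerSrc points at whatever build is currently published on mozilla.github.io, which may not match the pdfjs-dist version you installed. A possible way around that is to derive the worker URL from the library's own version. This is a sketch, assuming the 2.x package layout (build/pdf.worker.min.js) and the unpkg CDN; `workerSrcForVersion` is a hypothetical helper.

```javascript
// Sketch: build a worker URL pinned to a given pdfjs-dist version.
// Assumes the 2.x file layout on unpkg; newer major versions may use
// different file names.
function workerSrcForVersion(version) {
  return `https://unpkg.com/pdfjs-dist@${version}/build/pdf.worker.min.js`;
}
```

Usage would then be `pdfjsLib.GlobalWorkerOptions.workerSrc = workerSrcForVersion(pdfjsLib.version);`, so the worker follows the library whenever you upgrade.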

Web Worker

You do need the worker script (it is a web worker, not a service worker): pdfjs does not work without it being configured, so neither does react-pdf.

If you use CRA and do not want to use a CDN, you can perform the following steps:

1) Copy the worker to the public folder (create public/scripts first if it does not exist)

mkdir -p public/scripts
cp ./node_modules/pdfjs-dist/build/pdf.worker.js public/scripts

2) Point workerSrc at the copied file

pdfjsLib.GlobalWorkerOptions.workerSrc = `${process.env.PUBLIC_URL}/scripts/pdf.worker.js`

oshell answered Oct 20 '22 01:10