
Parsing PDF pages as JavaScript Images

As per the title, is there any way to parse the pages of an unprotected PDF file as JavaScript Image() objects?

It would also be OK to convert them before running the JavaScript, but I would like this to be done automatically and without any library that requires installation.

Someone on the internet posted this Bash script. Unfortunately, I don't know Bash, but running it was very simple.

#!/bin/bash
PDF='doc.pdf'
NUMPAGES=`identify -format %n "$PDF"`

for (( IDX=0; IDX<$NUMPAGES; IDX++ ))
do
  PAGE=$(($IDX+1))
  convert -resize 1200x900 "$PDF[$IDX]" `echo "$PDF" | sed "s/\.pdf$/-page$PAGE.jpg/"`
done

echo "Done"

But I got these errors:

line 3: identify: command not found
line 5: ((: IDX<: syntax error: operand expected (error token is "<")

Pre-converting the PDF using a Bash script would be a good solution. Can someone fix the script above or provide an alternative solution?

Many thanks in advance!

Saturnix asked Oct 16 '12 18:10


1 Answer

PDF.js will let you render the PDF to a canvas. Then you can do something like:

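// pdfCanvas is the <canvas> element that PDF.js rendered the page into
var img = new Image();
img.src = pdfCanvas.toDataURL(); // encodes the canvas as a PNG data URL by default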

I've been very impressed with PDF.js. I love letting the client's browser do as much of the work for me as possible.
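The canvas-to-image step above can be extended into a rough sketch that covers every page. This assumes a recent PDF.js build with the promise-based API (`getDocument(...).promise`, `page.render(...).promise`) is already loaded as `pdfjsLib` with its worker configured; older releases exposed a different global and call style. The helper name `pdfPagesToImages` and the `scale` and `doc` parameters are made up for illustration, and `doc.createElement('img')` is used in place of `new Image()` since both produce the same kind of element.

```javascript
// Sketch only: render each PDF page to an off-screen canvas with PDF.js,
// then copy the canvas contents into an <img> element.
// Assumptions: `pdfjsLib` is a recent PDF.js build (promise-based API);
// `pdfPagesToImages` is a hypothetical helper, not part of PDF.js.
async function pdfPagesToImages(pdfjsLib, url, scale = 1.5, doc = document) {
  const pdf = await pdfjsLib.getDocument(url).promise;
  const images = [];

  // PDF.js page numbers start at 1, not 0.
  for (let pageNumber = 1; pageNumber <= pdf.numPages; pageNumber++) {
    const page = await pdf.getPage(pageNumber);
    const viewport = page.getViewport({ scale });

    // Size the canvas to the scaled page and let PDF.js draw into it.
    const canvas = doc.createElement('canvas');
    canvas.width = Math.floor(viewport.width);
    canvas.height = Math.floor(viewport.height);
    await page.render({
      canvasContext: canvas.getContext('2d'),
      viewport,
    }).promise;

    // toDataURL() encodes the canvas as PNG unless told otherwise.
    const img = doc.createElement('img');
    img.src = canvas.toDataURL();
    images.push(img);
  }
  return images;
}
```

In a page, something like `pdfPagesToImages(pdfjsLib, 'doc.pdf').then(function (images) { images.forEach(function (img) { document.body.appendChild(img); }); });` would append one image per page. Pages are rendered one at a time to keep memory use predictable; large documents may still produce very long data URLs.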

Demo here: http://jsbin.com/pdfjs-helloworld-v2/1/edit

Trevor Dixon answered Oct 05 '22 21:10