I needed to convert a PDF file to images using PHP, and I did it with the help of Ghostscript. Here is the script:
$result = exec("gs -sDEVICE=png16m -sOutputFile=page-%03d.png $pdfname.pdf");
But almost all of the resulting images have white borders around them (the PDF pages don't have those borders). How can I get rid of them? Maybe there is some Ghostscript option that I couldn't find and that would help.
Here is one of the images -> http://www.pictureshack.ru/images/88046_page-009.png
Here is a screenshot of the PDF file -> http://www.pictureshack.ru/images/62869_pdf.PNG
I suspect that your pages have a CropBox defined which is smaller than the MediaBox. Ghostscript renders the full MediaBox by default, while most PDF viewers show only the CropBox, which would explain the extra white border. You can tell Ghostscript to use the CropBox by supplying the -dUseCropBox switch on the command line.
Of course, as Kurt has said, it's not really possible to tell without seeing the original file.
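As a rough sketch of how the switch could be added to the PHP call from the question: the helper name `buildGsCommand` and the file name `document.pdf` are made up for illustration, and `-dNOPAUSE -dBATCH` are added on the assumption that the script runs non-interactively. The output pattern is left unescaped because `escapeshellarg` replaces `%` on Windows, which would break `%03d`.

```php
<?php
// Hypothetical helper (not from the original post): builds a Ghostscript
// command line that renders each page to PNG using the CropBox.
function buildGsCommand(string $pdfPath): string
{
    return 'gs -dNOPAUSE -dBATCH'          // run without prompts, exit when done
        . ' -dUseCropBox'                  // render the CropBox instead of the MediaBox
        . ' -sDEVICE=png16m'               // 24-bit colour PNG output
        . ' -sOutputFile=page-%03d.png'    // one numbered file per page
        . ' ' . escapeshellarg($pdfPath);  // quote the input path for the shell
}

// Example file name is an assumption; substitute your own PDF.
$command = buildGsCommand('document.pdf');
echo $command, PHP_EOL;

// Uncomment to actually run Ghostscript (requires gs on the PATH):
// exec($command, $output, $exitCode);
```

If the borders are still there after this, the CropBox is probably not the cause, and the original file would need to be inspected.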