
Convert PDF to HTML [closed]

What is the best way to convert PDF documents so they can be viewed in the browser as HTML? The site has several PDF documents; when a visitor clicks "View as HTML", the document should be displayed on screen as an HTML page.

Standard website running PHP, Linux, Apache.

ToughPal asked Jun 05 '09 15:06



3 Answers

pdftohtml works fine: it is fast and stable, but the HTML it produces is ugly at best. I have used it for quite some time on a website that hosts many job resumes.

It is, however, a good solution for extracting textual content.
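To illustrate the text-extraction use, here is a minimal sketch that strips the markup from an HTML file such as the one pdftohtml writes and keeps only the visible text. It is written in Python with the standard library only; `TextExtractor` and `html_to_text` are names made up for this example and are not part of pdftohtml.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside <script> or <style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text fragments that are not inside a skipped element.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html):
    """Return the text fragments of an HTML string, one per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The result is plain text with one line per fragment, which is usually enough for indexing or search even when the generated HTML itself is not pleasant to look at.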

I would also give the Scribd API a try, or the Google Apps document API. Google does a great job of displaying and converting PDF files.

Alexis Perrier answered Sep 30 '22 11:09



Have you considered keeping the PDF data in a database and then dynamically creating either the PDF or the HTML page, depending on what the visitor selects?

Ian Jacobs answered Sep 30 '22 11:09



If you have command-line access at your hosting provider, there is a utility called pdftohtml in the poppler-utils package.

http://poppler.freedesktop.org/

It looks quite easy to use. I have not called it from inside PHP, but it should work.
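As a rough idea of what calling it from a script might look like, here is a minimal sketch in Python (not PHP) that builds the pdftohtml command line and runs it. The helper names `build_pdftohtml_command` and `convert_pdf` are invented for this example, the choice of options is only one reasonable combination, and the returned output path assumes pdftohtml appends `.html` to the output name it is given.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftohtml_command(pdf_path, output_base):
    """Return the argument list for one pdftohtml run (no shell involved)."""
    return [
        "pdftohtml",
        "-noframes",      # write a single HTML file instead of a frameset
        "-enc", "UTF-8",  # encoding of the generated text
        "-q",             # do not print messages
        str(pdf_path),
        str(output_base),
    ]


def convert_pdf(pdf_path, output_dir):
    """Convert one PDF and return where the HTML file is expected to be."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found; is poppler-utils installed?")
    pdf_path = Path(pdf_path)
    output_base = Path(output_dir) / pdf_path.stem
    # check=True raises CalledProcessError if pdftohtml exits with an error.
    subprocess.run(build_pdftohtml_command(pdf_path, output_base), check=True)
    # Assumption: the tool writes "<output_base>.html" when -noframes is set.
    return Path(str(output_base) + ".html")
```

Passing the arguments as a list avoids shell quoting problems with unusual file names. From PHP, the equivalent would presumably be to pass the same command to `exec()` with each path wrapped in `escapeshellarg()`, though that is untested here.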

Kevin K answered Sep 30 '22 09:09
