I'm trying to convert PDF to HTML programmatically. So far I've been using pdftohtml, but our users are not happy with the results.
Here's what I need:
I'm using Ruby on Rails, but any tool that runs on Unix would do, since I can call it from the command line. Of course, a nice gem or plugin would be perfect.
I'd prefer it to be open source
It needs to be able to handle images
It would be nice if there was an option to discard images if needed
It needs to be stable
It needs to return HTML with a layout close to the original PDF (I've tried pdftohtml and the result is not that good in a lot of cases)
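For context, calling a command-line converter from Ruby might look roughly like the sketch below. It assumes Poppler's pdftohtml is installed and on the PATH; the helper names `pdftohtml_command` and `convert_pdf` are made up for illustration, and the flags used are `-c` (complex output that tries to keep the layout), `-s` (a single HTML document for all pages) and `-i` (ignore images).

```ruby
require "open3"

# Hypothetical helper: builds the argument list for Poppler's pdftohtml.
# Kept separate from execution so it can be checked without the binary.
def pdftohtml_command(pdf_path, output_base, discard_images: false)
  cmd = ["pdftohtml", "-c", "-s"] # -c: complex (layout-preserving) output, -s: single document
  cmd << "-i" if discard_images   # -i: ignore images
  cmd + [pdf_path, output_base]
end

# Hypothetical wrapper: runs the command without a shell and raises on failure.
# Assumes pdftohtml is on the PATH; Errno::ENOENT is raised if it is not.
def convert_pdf(pdf_path, output_base, discard_images: false)
  cmd = pdftohtml_command(pdf_path, output_base, discard_images: discard_images)
  _stdout, stderr, status = Open3.capture3(*cmd)
  raise "pdftohtml failed: #{stderr}" unless status.success?
  true
end
```

Something like `convert_pdf("report.pdf", "report", discard_images: true)` would then run the conversion without images. Passing the arguments as an array avoids shell quoting problems with file names that contain spaces.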
How to convert a PDF into HTML. The quickest way to convert your PDF is to open it in Acrobat. Go to the File menu, navigate down to Export To, and select HTML Web Page. Your PDF will automatically convert and open in your default web browser.
Here are a couple more alternatives to pdftohtml/xpdf: