 

Converting docx/odt to PDF using JavaScript

I have a Node web app that needs to convert a docx file to PDF (using client-side resources only, with no plugins). I found a possible solution: convert the docx to HTML with docxjs, then convert the HTML to PDF with jspdf (docx->HTML->PDF). This could work, but I ran into several issues, especially with rendering. docxjs does not reproduce the docx layout faithfully in HTML, which is a problem.

So my question is: do you know of any free module or solution that could do the job directly, without going through HTML (I'm open to odt as a source as well)? If not, what would you advise me to do?

Thanks

ncohen asked May 11 '14 14:05



2 Answers

As you already know, there are no ready-to-use open-source libraries for this. You just can't get good results with the available options. My suggestions are:

  1. Use a third-party API, such as https://market.mashape.com/convertapi/word2pdf-1#!documentation
  2. Create your own service for this purpose. If you are able to, I suggest setting up a small server on Node.js (I bet you know how to do this). You can use LibreOffice as the converter, since it renders with good quality, like this:

    libreoffice --headless --invisible --convert-to pdf --outdir /www-disk/ {$file_name}

    Keep in mind that conversion usually takes a long time, so do not block the request-response flow: run each conversion in a separate process.

    One last thing: LibreOffice is not very lightweight, but its output quality is good. The unoconv tool is also worth a look.
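The separate-process advice above can be sketched in Node.js roughly as follows. This is a minimal sketch, not a tested production setup: it assumes a `libreoffice` binary is on the PATH (on some systems it is called `soffice`), it uses the double-dash option spelling from current LibreOffice documentation, and the helper names `buildConvertArgs`, `expectedPdfPath`, and `convertToPdf`, as well as the two-minute timeout, are illustrative choices rather than anything from the answer.

```javascript
// Sketch: run LibreOffice in a child process so the Node event loop stays free.
const { execFile } = require('node:child_process');
const path = require('node:path');

// Build the argument list for a headless docx/odt -> PDF conversion.
// (Hypothetical helper; the flags mirror the command shown in the answer.)
function buildConvertArgs(inputPath, outDir) {
  return ['--headless', '--convert-to', 'pdf', '--outdir', outDir, inputPath];
}

// LibreOffice names the output after the input file, with a .pdf extension.
function expectedPdfPath(inputPath, outDir) {
  const base = path.basename(inputPath, path.extname(inputPath));
  return path.join(outDir, `${base}.pdf`);
}

// Convert one file. Resolves with the expected PDF path, rejects if the
// process fails or exceeds the (assumed) two-minute timeout.
function convertToPdf(inputPath, outDir, binary = 'libreoffice') {
  return new Promise((resolve, reject) => {
    const args = buildConvertArgs(inputPath, outDir);
    execFile(binary, args, { timeout: 120000 }, (err) => {
      if (err) return reject(err);
      resolve(expectedPdfPath(inputPath, outDir));
    });
  });
}
```

A request handler could call something like `convertToPdf('/uploads/report.docx', '/www-disk/')` (the upload path is hypothetical) and respond once the promise settles. Because each call starts its own process, the server stays responsive, although a queue is probably needed to cap the number of conversions running at once.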

As of January 2019, there is docx-wasm, which works in node and performs the conversion locally where node is installed. Proprietary but freemium.

zarkone answered Nov 08 '22 09:11



It appears that even after three years ncohen had not found an answer. It was also unclear if it had to be a free (as in dollars) solution.

The original requirements were:

using client side resources only and no plugins

"Do you mean you don't want server side conversion?"
"Right, I would like my app to be totally autonomous."

Since all the other answers/comments only offered server side component solutions, which the author clearly stated was not what they wanted, here is a proposed answer.

The company I work for has had a solution for a few years now that can convert DOCX (not odt yet) files to PDF completely in the browser, with no server-side component required. It currently uses asm.js, PNaCl, or WASM, depending on the browser.

https://www.pdftron.com/samples/web/samples/viewing/viewing/

Open an office file using the demo above, and you will see no server communication. Everything is done client side. This demo works on mobile browsers also.

Ryan answered Nov 08 '22 10:11
