Create a docx (Word) document by using Perl (module)

Tags: perl, docx

I have been looking for some time now, and I decided to try some crowd sourcing.

I have searched (Googled) for the answer and looked through Stack Overflow for some time now, and I cannot find a proper and relatively easy way of creating DOCX documents via Perl.

I want to create a DOC file, and since DOCX is XML-based, I guessed that it would be the easier format to generate.

I located the RTF::Writer module, but it's very limited in its capabilities.

There is more than one such library for PHP and for other languages, but unfortunately I cannot use those.

I am not running in a Windows environment, so I cannot use anything that integrates with Office. In addition, I don't want to start bundling Office with my product.

I am open to suggestions, but please provide sensible ones :) i.e. no "you are scr*wed, DOCX is impossible".

Here is what I tried:

1) Took an existing DOCX and modified the XML directly. All I achieved was to make Word crash :) Apparently Word is very sensitive to attribute order.

2) Googled for answers and found some, like Win32::Word::Writer, which only works on Windows and requires OLE and Office.

3) Found a lot of posts from 2010 that say it's impossible. Almost 4 years have passed, so there is probably something out there that can do it.

4) Looked for commercial solutions and couldn't find one. I found FOP, which is able to create RTF. That is pretty close, but it lacks a lot of the styling I would like to use.

5) Found a lot of things (code and modules) that can extract data from a DOCX, but nothing that can create one. Weird.

6) Found abandoned code like OpenOffice::OODoc, which stopped being developed in 2010, requires OpenOffice to be installed, and may also require a non-headless system (i.e. one with a GUI).

Thanks guys for any answers :}

Noam Rathaus asked Oct 03 '22 02:10


2 Answers

One cheat that I've used in the past is to output HTML with a ".doc" file name.

This gives you less fine-grained control over the document formatting, but may be sufficient for your use case.
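
A minimal sketch of that trick in Perl, using only core features. The sub names (`escape_html`, `build_html`, `write_html_as_doc`), the file name `report.doc`, and the sample text are hypothetical, and the CSS is only an illustration: how much styling Word honours when it opens such a file is not guaranteed.

```perl
use strict;
use warnings;

# Escape the characters that are special in HTML text.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

# Build a small HTML page: one heading plus one <p> per paragraph.
sub build_html {
    my ($title, @paragraphs) = @_;
    my $safe_title = escape_html($title);
    my $body = join '', map { '<p>' . escape_html($_) . "</p>\n" } @paragraphs;
    return <<"HTML";
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<title>$safe_title</title>
<style>h1 { font-family: Arial; color: #336699; } p { font-size: 11pt; }</style>
</head>
<body>
<h1>$safe_title</h1>
$body</body>
</html>
HTML
}

# Write the HTML under a ".doc" name so that it is opened with Word.
sub write_html_as_doc {
    my ($path, $title, @paragraphs) = @_;
    open my $fh, '>:encoding(UTF-8)', $path or die "Cannot write $path: $!";
    print {$fh} build_html($title, @paragraphs);
    close $fh or die "Cannot close $path: $!";
    return $path;
}

# Hypothetical usage: the file name and contents are placeholders.
write_html_as_doc('report.doc', 'Quarterly report', 'Sales rose 4% & costs fell.');
```

The result is still HTML, not a real DOC or DOCX file, so anything that inspects the file contents rather than the extension will see it as HTML.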

tobyink answered Oct 11 '22 16:10


The closest I've ever managed is to generate an OpenOffice document and then have OpenOffice export it as .docx (in headless mode).

You need some fonts installed, but no GUI for this. I use OpenOffice::OODoc, and it's enough to let me open up an existing document and add text/pictures.

The OpenOffice (LibreOffice) export process is not 100% reliable: it occasionally hangs, and I've never been able to get a simple, repeatable test case that reproduces the problem. I add a timer to kill the process and let it retry.
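
The kill-and-retry timer could look roughly like the sketch below. It is not the answerer's actual code. It assumes LibreOffice's `soffice` binary is on the `PATH` and accepts `--headless --convert-to docx --outdir`. The sub names, the 60-second timeout, and the three attempts are hypothetical defaults.

```perl
use strict;
use warnings;
use POSIX ();

# Run a command in a child process and kill it if it runs longer than
# $timeout seconds. Returns true only if it exited with status 0 in time.
sub run_with_timeout {
    my ($timeout, @cmd) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: replace ourselves with the command; exit 127 if exec fails.
        exec { $cmd[0] } @cmd or POSIX::_exit(127);
    }
    my $status;
    my $finished = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        waitpid($pid, 0);
        $status = $?;
        alarm 0;
        1;
    };
    if (!$finished) {
        alarm 0;
        kill 'KILL', $pid;    # the process hung: kill it ...
        waitpid($pid, 0);     # ... and reap it so no zombie is left behind
        return 0;
    }
    return $status == 0;
}

# Convert $input to DOCX with headless LibreOffice, retrying on failure.
# Assumption: "soffice" is LibreOffice's launcher and is on the PATH.
sub convert_to_docx {
    my ($input, $outdir, %opt) = @_;
    my $timeout = $opt{timeout} // 60;    # hypothetical default, in seconds
    my $tries   = $opt{tries}   // 3;     # hypothetical default
    my @cmd = ('soffice', '--headless', '--convert-to', 'docx',
               '--outdir', $outdir, $input);
    for my $attempt (1 .. $tries) {
        return 1 if run_with_timeout($timeout, @cmd);
        warn "Conversion attempt $attempt failed or timed out\n";
    }
    return 0;
}
```

A call such as `convert_to_docx('report.odt', '/tmp/out', timeout => 30)` (paths are placeholders) returns true once a conversion succeeds and false after the last attempt fails. The sketch kills only the direct child, so a launcher that starts further processes would need extra handling.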

Not a perfect situation, I'm afraid, and I hope someone has a better solution.

Richard Huxton answered Oct 11 '22 18:10