I need to pre-produce a million or two PDF files from a simple template (a few pages and tables) with embedded fonts. Usually, I would stay low level in a case like this, and compose everything with a library like ReportLab, but I joined late in the project.
Currently, I have a template.odt and use markers in its content.xml that get filled with data from a DB. I can create the ODT files smoothly, and they always look right.
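For reference, the fill step looks roughly like the sketch below. It is a simplified version, not the project's actual code: the `${name}` marker syntax and the function name `fill_odt_template` are assumptions made for illustration. An ODT file is a zip archive, so only the standard library is needed. Note that a word processor may split a marker across several XML spans, in which case a plain string replace will not find it.

```python
import zipfile
from xml.sax.saxutils import escape


def fill_odt_template(template_path, output_path, values):
    """Copy an ODT template, replacing ${name} markers in content.xml.

    The ${name} marker syntax is an assumption for this sketch.
    """
    with zipfile.ZipFile(template_path) as src, \
            zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as dst:
        # infolist() preserves the template's entry order, so "mimetype"
        # stays first if it was first in the template.
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "content.xml":
                text = data.decode("utf-8")
                for name, value in values.items():
                    # Escape &, < and > so DB values cannot break the XML.
                    text = text.replace("${%s}" % name, escape(str(value)))
                data = text.encode("utf-8")
            # ODF packages expect the "mimetype" entry to be stored uncompressed.
            if item.filename == "mimetype":
                compress = zipfile.ZIP_STORED
            else:
                compress = zipfile.ZIP_DEFLATED
            dst.writestr(item.filename, data, compress_type=compress)
```

Usage would be something like `fill_odt_template("template.odt", "out/0001.odt", {"name": "Smith & Sons"})`.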
For the ODT to PDF conversion, I'm using OpenOffice in server mode (and PyODConverter with a named pipe), but it's not very reliable: in a batch of documents, there is eventually a point after which all the processed files are converted into garbage (wrong fonts and letters sprawled all over the page).
The problem is not predictably reproducible (it does not depend on the data) and happens in OOo 2.3 and 3.2, on Ubuntu, XP, Server 2003 and Windows 7. My Heisenbug detector is ticking.
I tried reducing the batch size and restarting OOo after each batch; still, a small percentage of the documents are messed up.
Of course I'll write about this on the OOo mailing lists, but in the meantime I have a delivery due and have already lost too much time.
Where do I go?
Completely avoid the ODT format and go for another template system.
Keep the format but go for another tool/program for the conversion.
Converting to an intermediate .DOC format could help to avoid the OOo bug, but it would double the processing time and complicate a task that is already too hairy.
Try to produce the PDFs twice and compare them, discarding the whole batch if there's something wrong.
Restart OOo after processing each document.
Go for ReportLab and recreate the pages programmatically. This is the approach I'm going to try in a few minutes.
Learn to properly format bulleted lists
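A minimal sketch of the "produce twice and compare" option, with retries per document rather than discarding the whole batch. Both `convert` and `same_output` are hypothetical callables that the caller supplies; nothing here is OOo or PyODConverter API. A byte-for-byte comparison is unlikely to work as `same_output`, since generated PDFs typically carry timestamps and document IDs, so it would have to compare something like extracted text or rendered pages. It also would not catch two runs that fail in the same way.

```python
def convert_with_check(convert, same_output, source, attempts=3):
    """Convert `source` twice per attempt; accept only if both results agree.

    `convert(source)` returns a conversion result (hypothetical callable).
    `same_output(a, b)` decides whether two results match (hypothetical callable).
    """
    for _ in range(attempts):
        first = convert(source)
        second = convert(source)
        if same_output(first, second):
            return first
    raise RuntimeError("no two matching conversions for %r" % (source,))
```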
Thanks a lot.
Edit: it seems like I cannot use ReportLab at all, it won't let me embed the font. My font comes in TrueType and OpenType versions.
The TrueType one says "TTFError: Font does not allow subsetting/embedding (0100) ".
The OpenType version says "TTFError[...] postscript outlines are not supported".
Very very funny.
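The "(0100)" in the TrueType error appears to be the font's `fsType` embedding flags from its OS/2 table, printed in hex; in the OpenType specification, bit `0x0100` means "no subsetting". A minimal sketch for checking a font file before handing it to a PDF library, using only the standard library. The function names are made up for illustration, and the sketch assumes a single-font .ttf/.otf file, not a .ttc collection.

```python
import struct


def read_fs_type(font_path):
    """Return the OS/2 fsType flags of a single-font TrueType/OpenType file."""
    with open(font_path, "rb") as f:
        header = f.read(12)  # sfnt version, numTables, three search fields
        num_tables = struct.unpack(">H", header[4:6])[0]
        for _ in range(num_tables):
            # Each table record: tag, checksum, offset, length (16 bytes).
            tag, _checksum, offset, _length = struct.unpack(">4sLLL", f.read(16))
            if tag == b"OS/2":
                # fsType follows version, xAvgCharWidth, usWeightClass
                # and usWidthClass (2 bytes each).
                f.seek(offset + 8)
                return struct.unpack(">H", f.read(2))[0]
    raise ValueError("no OS/2 table found")


def describe_fs_type(fs_type):
    """Translate fsType bits into the names used by the OpenType spec."""
    names = [
        (0x0002, "restricted license embedding"),
        (0x0004, "preview & print embedding"),
        (0x0008, "editable embedding"),
        (0x0100, "no subsetting"),
        (0x0200, "bitmap embedding only"),
    ]
    flags = [name for bit, name in names if fs_type & bit]
    return flags or ["installable embedding"]
```

If the flags really do forbid subsetting, the restriction comes from the font's license settings and not from the PDF library, so a tool that embeds the full font, or a differently licensed font, would be needed.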
OpenOffice Writer can convert an ODT file into a PDF with one click: select the "Export directly as PDF" button in the toolbar. A window opens where you can specify the file name and location.
Go to File > Save As > Office Open XML Document. Rename your file if needed, then click Save to continue. This creates a .docx version of the original file.
For creating such a large number of PDF files, OpenOffice seems to me the wrong product. You should use a real reporting solution that is optimized for producing PDF files in bulk. There are many different tools. I would recommend i-net Clear Reports (formerly called i-net Crystal-Clear).
The disadvantage is that you must restart your development.