Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get rid of Helvetica in iText XMLWorker?

We're using iText to generate PDF files from Java code, which works pretty well in most cases. A few days ago we started to generate PDF/A instead of normal PDF files which needs to embed all fonts. The iText Document is mostly build of custom PdfPTable and other classes where we control the fonts directly. All used fonts are created from TTF files loaded via the following code - which works just fine:

    private BaseFont load(String path) {
        try {
            URL fontResource = PrintSettings.class.getResource(path);
            if (fontResource == null) {
                return null;
            }
            String fontPath = fontResource.toExternalForm();
            BaseFont baseFont = BaseFont.createFont(fontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
            baseFont.setSubset(true);
            return baseFont;
        }
        catch (DocumentException ex) {
            Logger.getLogger(PrintSettings.class).warn("...");
        }
        catch (IOException ex) {
            Logger.getLogger(PrintSettings.class).warn("...");
        }
        return FontFactory.getFont(PrintSettings.FONT, "UTF-8", true, 8f, Font.NORMAL, PrintSettings.COLOR_TEXT).getBaseFont();
    }

Now we use one specific content type in the PDF which generates from HTML code. We use the XMLWorkerto handle that part. This worked just fine, as long as we didn't embed the fonts. But with PDF/A we need to embed all fonts and now we struggle with an unknown source of Helvetica usage.

We've tried to solve this by using our own FontProvider class like this one:

public class PrintFontProvider extends FontFactoryImp {

    @Override
    public Font getFont(String fontName, String encoding, boolean embedded, float size, int style, BaseColor color, boolean cached) {

        // LiberationSans – http://de.wikipedia.org/wiki/Liberation_(Schriftart) – http://scripts.sil.org/cms/scripts/page.php?item_id=OFL_web
        if (style == Font.NORMAL)     return new Font(this.load("fonts/Liberation/LiberationSans-Regular.ttf"),    size, Font.NORMAL, color);
        if (style == Font.BOLD)       return new Font(this.load("fonts/Liberation/LiberationSans-Bold.ttf"),       size, Font.NORMAL, color);
        if (style == Font.BOLDITALIC) return new Font(this.load("fonts/Liberation/LiberationSans-BoldItalic.ttf"), size, Font.NORMAL, color);
        if (style == Font.ITALIC)     return new Font(this.load("fonts/Liberation/LiberationSans-Italic.ttf"),     size, Font.NORMAL, color);
        return new Font(this.load("fonts/Liberation/LiberationSans-Regular.ttf"), size, style, color);
    }

    private BaseFont load(String path) { ... }
}

It's connected with the XMLWorker using the following code:

HtmlPipelineContext html = new HtmlPipelineContext(null);
html.setTagFactory(Tags.getHtmlTagProcessorFactory());
CSSResolver css = XMLWorkerHelper.getInstance().getDefaultCssResolver(true);

// We need to control the FontProdiver!
html.setCssAppliers(new CssAppliersImpl(new PrintFontProvider()));

Pipeline<?> pipeline = new CssResolverPipeline(css, new HtmlPipeline(html, new PdfWriterPipeline(this.document, writer)));
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser p = new XMLParser(worker);
p.parse(new ByteArrayInputStream(StringUtils.iTextHTML(string).getBytes()));

Most simple HTML elements work this way... but there are some which seem to ignore the FontProvider and keep using Helvetica which won't be embedded in the PDF/A (we don't have that font). For example <ol><li>...</li></ol> make use of this.

Caused by: com.itextpdf.text.pdf.PdfXConformanceException: All the fonts must be embedded. This one isn't: Helvetica
at com.itextpdf.text.pdf.internal.PdfXConformanceImp.checkPDFXConformance(PdfXConformanceImp.java:225)
at com.itextpdf.text.pdf.PdfWriter.addSimple(PdfWriter.java:2192)
at com.itextpdf.text.pdf.PdfContentByte.setFontAndSize(PdfContentByte.java:1444)
at com.itextpdf.text.pdf.PdfDocument.writeLineToContent(PdfDocument.java:1463)
at com.itextpdf.text.pdf.ColumnText.go(ColumnText.java:968)
at com.itextpdf.text.pdf.ColumnText.go(ColumnText.java:841)
at com.itextpdf.text.pdf.ColumnText.showTextAligned(ColumnText.java:1189)
at com.itextpdf.text.pdf.ColumnText.showTextAligned(ColumnText.java:1208)
at com.itextpdf.text.pdf.PdfDocument.flushLines(PdfDocument.java:1193)
at com.itextpdf.text.pdf.PdfDocument.newPage(PdfDocument.java:830)
at com.itextpdf.text.Document.newPage(Document.java:367)

I've run out of ideas how to get rid of Helvetica for now... trying to solve this for 8+ hours now... any more ideas?

like image 931
Daniel Bleisteiner Avatar asked Nov 04 '22 17:11

Daniel Bleisteiner


1 Answers

I've dug a little deeper and traveled from OrderedUnorderedList over ListItem to List...

/**
 * Adds an <CODE>Element</CODE> to the <CODE>List</CODE>.
 *
 * @param   o       the element to add.
 * @return true if adding the object succeeded
 * @since 5.0.1 (signature changed to use Element)
 */
@Override
public boolean add(final Element o) {
    if (o instanceof ListItem) {
        ListItem item = (ListItem) o;
        if (this.numbered || this.lettered) {
            Chunk chunk = new Chunk(this.preSymbol, this.symbol.getFont());
            chunk.setAttributes(this.symbol.getAttributes());
            int index = this.first + this.list.size();
            if ( this.lettered )
                chunk.append(RomanAlphabetFactory.getString(index, this.lowercase));
            else
                chunk.append(String.valueOf(index));
            chunk.append(this.postSymbol);
            item.setListSymbol(chunk);
        }
        else {
            item.setListSymbol(this.symbol);
        }
        item.setIndentationLeft(this.symbolIndent, this.autoindent);
        item.setIndentationRight(0);
        return this.list.add(item);
    }
    else if (o instanceof List) {
        List nested = (List) o;
        nested.setIndentationLeft(nested.getIndentationLeft() + this.symbolIndent);
        this.first--;
        return this.list.add(nested);
    }
    return false;
}

This code refers to this.symbol.getFont() which is set to undefined on class initialization...

public class List implements TextElementArray, Indentable {

    [...]    

    /** This is the listsymbol of a list that is not numbered. */
    protected Chunk symbol = new Chunk("- ");

I simply used another Chunk constructor which takes a Font of mine and voila... SOLVED. The numbered list no longer uses Helvetica but my own font which gets embedded properly.

This took me ages! Another way might have been to implement an own TagProcessor for <ol> but we don't have the time for this anymore. I'll file a bug report for this... we'll see if it gets fixed a bit more flexible.

like image 88
Daniel Bleisteiner Avatar answered Nov 09 '22 09:11

Daniel Bleisteiner