
Apache FOP: Displaying UTF-8 Characters in PDF (without embed?)

I'm trying to use FOP to export a PDF with UTF-8 characters, preferably without needing to embed the font.
The following code:

<fo:block font="10pt Helvetica" text-align="justify" space-after="10pt" space-before="8pt" keep-with-previous="auto" keep-together.within-page="auto"> 
  <fo:block font-weight="bold" color="gray">Summary</fo:block>
  <fo:block text-indent="1em" keep-with-previous="always">
    <fo:block text-indent="1em" space-before="4pt">
      <fo:block text-indent="1em" space-before="4pt">私はガラスを食べられます。それは私を傷つけません
      </fo:block>
    </fo:block>
  </fo:block>
</fo:block>

produces #################### in the PDF. I'm aware of the issue: http://xmlgraphics.apache.org/fop/faq.html#pdf-characters

When I go under Document Properties->Fonts, the Helvetica font is listed with 'Encoding: ANSI'. Is there a way to change this?

If I were embedding, what would be the best way to do so without having access to Helvetica.ttf? I've tried using DejaVuSans, but I end up with squares in place of the # signs.

Note that this is not a one-time use from the command line (that would be a start), but an extension to an existing app. I'm trying to support UTF-8 characters without too much complexity.

Asked by Tyler D on Aug 19 '09 at 20:08



1 Answer

AFAICT, the standard fonts defined in the PDF specification (Helvetica is one of them) only include characters from ISO-Latin-1. If you want a character that falls outside of those defined in Annex D: Character Sets and Encodings, such as the Japanese text in your example, then you are expected to embed a font that contains it.
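
For what it's worth, here is a minimal sketch of what embedding could look like in a FOP user configuration file. The font path and the triplet name `JapaneseFont` are placeholders I made up for illustration; substitute any TrueType font you have that actually contains Japanese glyphs. As far as I know DejaVu Sans does not cover them, which would explain the squares. Depending on your FOP version you may also need a `metrics-url` attribute pointing at a generated metrics file.

```xml
<!-- fop.xconf: a sketch, not a complete configuration -->
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- embed-url is a hypothetical path; point it at a font with Japanese coverage -->
        <font kerning="yes" embed-url="file:///path/to/japanese-font.ttf">
          <!-- "JapaneseFont" is an arbitrary name chosen for this example -->
          <font-triplet name="JapaneseFont" style="normal" weight="normal"/>
          <font-triplet name="JapaneseFont" style="normal" weight="bold"/>
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>
```

With something like this loaded, the blocks containing the Japanese text would reference the registered name, e.g. `font-family="JapaneseFont"`, instead of Helvetica. How the configuration file is handed to FOP from application code depends on the FOP version, so check the embedding documentation for the release you are on.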

Answered by D.Shawley on Nov 17 '22 at 20:11
