I have an XML file encoded in UTF-8, with the encoding declared in its XML declaration.
When I execute fop -xml xml.xml -xsl xsl.xsl -pdf pdf.pdf
, the non-ASCII (UTF-8) characters in the output PDF are broken. Importantly, this affects text that comes from the XSL file as well as text that comes from the XML file.
The affected characters are replaced by #.
What could be wrong?
Xsl file:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:fo="http://www.w3.org/1999/XSL/Format" xmlns:java="http://xml.apache.org/xslt/java" exclude-result-prefixes="java" version="1.0" xmlns="http://www.w3.org/1999/xhtml">
<xsl:output method="xml" version="1.0" indent="yes" encoding="UTF-8" />
<xsl:template match="/">
<fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format">
<fo:layout-master-set>
<fo:simple-page-master master-name="A4" margin="1cm">
<fo:region-body margin="2cm" margin-left="1cm" margin-right="1cm"/>
<fo:region-before extent="3cm"/>
<fo:region-after extent="5mm"/>
</fo:simple-page-master>
</fo:layout-master-set>
<fo:page-sequence master-reference="A4">
<fo:static-content flow-name="xsl-region-before">
<fo:block font-size="24pt" font-family="Calibri">Filmoteka</fo:block>
</fo:static-content>
<fo:static-content flow-name="xsl-region-after">
<fo:block font-size="10pt" font-family="Calibri">Wygenerowano: <xsl:call-template name="dataCzas" /></fo:block>
</fo:static-content>
<fo:flow flow-name="xsl-region-body">
<fo:block font-size="12pt" font-family="Calibri" padding-after="1cm">
<fo:table table-layout="fixed" width="100%" border="solid black 1px">
<fo:table-column column-width="8mm"/>
<fo:table-column column-width="40mm"/>
<fo:table-column column-width="40mm"/>
<fo:table-column column-width="13mm"/>
<fo:table-column column-width="65mm"/>
<fo:table-header>
<fo:table-row>
<fo:table-cell border="solid black 2px">
<fo:block font-weight="bold" background-color="#cccccc">Lp.</fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 2px">
<fo:block font-weight="bold" background-color="#cccccc">Tytuł PL</fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 2px">
<fo:block font-weight="bold" background-color="#cccccc">Reżyseria</fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 2px">
<fo:block font-weight="bold" background-color="#cccccc">Rok</fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 2px">
<fo:block font-weight="bold" background-color="#cccccc">Obsada</fo:block>
</fo:table-cell>
</fo:table-row>
</fo:table-header>
<fo:table-body>
<xsl:apply-templates />
</fo:table-body>
</fo:table>
</fo:block>
</fo:flow>
</fo:page-sequence>
</fo:root>
</xsl:template>
<xsl:template match="film">
<fo:table-row>
<fo:table-cell border="solid black 1px">
<fo:block><xsl:number format="1"/></fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 1px">
<fo:block font-family="Calibri"><xsl:value-of select="tytul_pol"/></fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 1px">
<fo:block font-family="Calibri"><xsl:value-of select="rezyser"/></fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 1px">
<fo:block font-family="Calibri"><xsl:value-of select="rok"/></fo:block>
</fo:table-cell>
<fo:table-cell border="solid black 1px">
<fo:block font-family="Calibri"><xsl:value-of select="obsada"/></fo:block>
</fo:table-cell>
</fo:table-row>
</xsl:template>
<xsl:template name="dataCzas">
<xsl:value-of select="java:format(java:java.text.SimpleDateFormat.new('dd MMMM yyyy, HH:mm:ss'), java:java.util.Date.new())"/>
</xsl:template>
</xsl:stylesheet>
xml file:
http://pastebin.com/fr9fChtn
The XSLT processor operates on two inputs: the XML document to transform, and the XSLT stylesheet that describes the transformations to apply to that XML.
XSLT uses the <xsl:output> element to determine whether the output produced by the transformation is well-formed XML (<xsl:output method="xml"/>), HTML (<xsl:output method="html"/>), or plain text (<xsl:output method="text"/>).
XSLT is an XML-based language that transforms an XML document into another format such as HTML, XHTML, or XSL-FO, which a formatter like FOP can then render to PDF. XSLT is a part of XSL, the stylesheet language family for XML.
JavaScript can run XSLT transformations through the XSLTProcessor object. Once instantiated, an XSLTProcessor has an importStylesheet() method that takes as an argument the XSLT stylesheet to be used in the transformation. The stylesheet has to be passed in as an XML document.
If FOP outputs characters as #
, the selected font does not include a glyph to represent them.
This is presumably because your XML input file contains lines like:
<kraj>Francja, USA, Włochy</kraj>
The problematic character here is ł
.
So, to answer your question: FOP does support UTF-8, it is just that the font (in your case: font-family='Calibri'
) does not have a means to represent the characters.
If this is indeed the case, FOP should output a warning along the lines of
WARNING: Glyph for "ł" not available in font "DejaVuSans"
Now, to handle characters that are missing from the font you have chosen, either change the output font altogether or, as a workaround, wrap them in fo:inline elements that select a different font.
For instance, this is how you make sure that for the character Σ
(a mathematical operator), the right font is selected:
<fo:block>
<fo:inline font-family='Symbol'>Σ</fo:inline>
</fo:block>
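The other option mentioned above, changing the font altogether, might look like the sketch below for one of the header cells from the question. It assumes that a font containing the Polish letters is installed and known to FOP under the family name "DejaVu Sans"; that name is only an illustration and does not come from the question.

```xml
<!-- Assumption: "DejaVu Sans" is registered with FOP and contains ł, ż, etc. -->
<fo:block xmlns:fo="http://www.w3.org/1999/XSL/Format"
          font-family="DejaVu Sans"
          font-weight="bold"
          background-color="#cccccc">Tytuł PL</fo:block>
```

If FOP cannot find the font named here, it will most likely fall back to a default font again and the # characters will remain, so the warnings FOP prints are worth checking after the change.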
See this page for more info on fonts with FOP: http://xmlgraphics.apache.org/fop/trunk/fonts.html .
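As a rough illustration of what that page describes, a user configuration file can tell FOP where to find fonts such as Calibri. The file name fop.xconf and the font path below are assumptions made for this sketch, so adjust them to your system.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- fop.xconf: minimal sketch of a FOP configuration that registers fonts -->
<fop version="1.0">
  <renderers>
    <renderer mime="application/pdf">
      <fonts>
        <!-- Option 1: let FOP scan the operating system's font directories -->
        <auto-detect/>
        <!-- Option 2: register one font file explicitly.
             The path is hypothetical; point it at the real font file. -->
        <font kerning="yes" embed-url="file:///C:/Windows/Fonts/calibri.ttf">
          <font-triplet name="Calibri" style="normal" weight="normal"/>
        </font>
      </fonts>
    </renderer>
  </renderers>
</fop>
```

The configuration is passed with the -c option, for example: fop -c fop.xconf -xml xml.xml -xsl xsl.xsl -pdf pdf.pdf. Either option alone should be enough. Note that the bold header cells would probably need a further font-triplet with weight="bold" that points at the bold font file.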