
iText: create XMP in a PDF with Java

I need to create the following XMP metadata in Java (using iText) and add it to one of my PDFs.

<rdf:Description rdf:about="" xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#" xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#" xmlns:pdfaType="http://www.aiim.org/pdfa/ns/type#" xmlns:pdfaField="http://www.aiim.org/pdfa/ns/field#"> <pdfaExtension:schemas>
<rdf:Bag>
<rdf:li rdf:parseType="Resource">
<pdfaSchema:schema>ABI Assegni Schema</pdfaSchema:schema> <pdfaSchema:namespaceURI>http://abi.it/std/cheque/xmlns</pdfaSchema:namespaceURI> <pdfaSchema:prefix>assegni</pdfaSchema:prefix>
<pdfaSchema:property>
    <rdf:Seq>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>IDDocumento</pdfaProperty:name> <pdfaProperty:valueType>Text</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Identificativo univoco del documento</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>datachiusura</pdfaProperty:name> <pdfaProperty:valueType>Date</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Data e ora della produzione del file</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>oggettodocumento</pdfaProperty:name> <pdfaProperty:valueType>Text</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Oggetto del documento</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>soggettoproduttore</pdfaProperty:name> <pdfaProperty:valueType>soggetto</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Soggetto produttore</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>destinatario</pdfaProperty:name> <pdfaProperty:valueType>soggetto</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Destinatario</pdfaProperty:description>
</rdf:li> </rdf:Seq>

</pdfaSchema:property>
</rdf:li>
</rdf:Bag>
</pdfaExtension:schemas>
</rdf:Description>

So far I have tried this code:

PdfReader reader = new PdfReader(baos.toByteArray());
PdfAStamper stamper = new PdfAStamper(reader, baos, PdfAConformanceLevel.PDF_A_1B);

String namespaceExtension = new String("http://www.aiim.org/pdfa/ns/extension/");
String namespaceSchema = new String("http://www.aiim.org/pdfa/ns/schema#");
String namespaceProperty = new String("http://www.aiim.org/pdfa/ns/property#");
String namespaceType = new String("http://www.aiim.org/pdfa/ns/type#");
String namespaceField = new String("http://www.aiim.org/pdfa/ns/field#");
XMPSchemaRegistry registry = XMPMetaFactory.getSchemaRegistry();
registry.registerNamespace(namespaceExtension, "pdfaExtension");
registry.registerNamespace(namespaceSchema, "pdfaSchema");
registry.registerNamespace(namespaceProperty, "pdfaProperty");
registry.registerNamespace(namespaceType, "pdfaType");
registry.registerNamespace(namespaceField, "pdfaField");

XmpWriter w = new XmpWriter(baos);
w.appendArrayItem(namespaceExtension, "schemas", "a");

w.close();

writer.setXmpMetadata(baos.toByteArray());

The resulting XMP is:

<pdfaExtension:schemas>
    <rdf:Bag>
      <rdf:li>a</rdf:li>
    </rdf:Bag>

I can't work out how to proceed from here. Any idea how to do this?

Thanks in advance

Giamma asked Jul 26 '16 14:07



1 Answer

I can answer the question as phrased using iText 5, although I consider this answer a bit of a "hack" in that it uses none of iText's semantic metadata objects, most of which appear to be deprecated. In particular, xmp.DublinCoreSchema, xmp.PdfSchema, xmp.XmpArray and xmp.XmpSchema are deprecated, while xmp.CustomSchema no longer exists.

The iText documentation is very poor in this regard.

The answer should be available here or here or here but none of these helped. They only show how to manipulate the info section.

A solution can be derived from the thread "Adding & retrieve custom properties to PDF using XMP", but all the iText classes it uses are deprecated.

In the end, I noticed that any XML can be inserted via stamper.setXmpMetadata(metadata), where metadata is a byte[] containing the XML. This XML could be created with DOM, but the following quick-and-dirty example reads it from a file.

package itext.sandpit;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Paragraph;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.xmp.XMPException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;


public class ItextSandpit {

    public static void main(String[] args) throws DocumentException,
            IOException,
            XMPException {

        // Create PDF
        Document document = new Document();
        PdfWriter.getInstance(
                document, new FileOutputStream("mypdf.pdf"));
        document.open();
        document.add(new Paragraph("Hello World!"));
        document.close();

        // Read metadata: Files.readAllBytes reads the whole file and closes it,
        // whereas a single InputStream.read call may return fewer bytes than requested
        byte[] metadatabytes = Files.readAllBytes(Paths.get("metadata.xml"));

        // Add metadata
        PdfReader reader = new PdfReader("mypdf.pdf");
        PdfStamper stamper = new PdfStamper(reader, new FileOutputStream("mypdf_plus_xmp.pdf"));

        stamper.setXmpMetadata(metadatabytes);
        stamper.close();
        reader.close();
    }

}

Create a file metadata.xml, paste the XML from the question into it, and run the program. To confirm that the metadata really is inside the created PDF, run pdfinfo -meta mypdf_plus_xmp.pdf, which yields

Producer:       iText® 5.5.12 ©2000-2017 iText Group NV (AGPL-version); modified using iText® 5.5.12 ©2000-2017 iText Group NV (AGPL-version)
CreationDate:   Tue Oct 10 21:01:21 2017
ModDate:        Tue Oct 10 21:01:21 2017
Tagged:         no
UserProperties: no
Suspects:       no
Form:           none
JavaScript:     no
Pages:          1
Encrypted:      no
Page size:      595 x 842 pts (A4)
Page rot:       0
File size:      3224 bytes
Optimized:      no
PDF version:    1.4
Metadata:
<rdf:Description rdf:about="" xmlns:pdfaExtension="http://www.aiim.org/pdfa/ns/extension/" xmlns:pdfaSchema="http://www.aiim.org/pdfa/ns/schema#" xmlns:pdfaProperty="http://www.aiim.org/pdfa/ns/property#" xmlns:pdfaType="http://www.aiim.org/pdfa/ns/type#" xmlns:pdfaField="http://www.aiim.org/pdfa/ns/field#"> <pdfaExtension:schemas>
<rdf:Bag>
<rdf:li rdf:parseType="Resource">
<pdfaSchema:schema>ABI Assegni Schema</pdfaSchema:schema> <pdfaSchema:namespaceURI>http://abi.it/std/cheque/xmlns</pdfaSchema:namespaceURI> <pdfaSchema:prefix>assegni</pdfaSchema:prefix>
<pdfaSchema:property>
    <rdf:Seq>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>IDDocumento</pdfaProperty:name> <pdfaProperty:valueType>Text</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Identificativo univoco del documento</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>datachiusura</pdfaProperty:name> <pdfaProperty:valueType>Date</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Data e ora della produzione del file</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>oggettodocumento</pdfaProperty:name> <pdfaProperty:valueType>Text</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Oggetto del documento</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>soggettoproduttore</pdfaProperty:name> <pdfaProperty:valueType>soggetto</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Soggetto produttore</pdfaProperty:description>
</rdf:li>
<rdf:li rdf:parseType="Resource">
<pdfaProperty:name>destinatario</pdfaProperty:name> <pdfaProperty:valueType>soggetto</pdfaProperty:valueType> <pdfaProperty:category>external</pdfaProperty:category> <pdfaProperty:description>Destinatario</pdfaProperty:description>
</rdf:li> </rdf:Seq>

</pdfaSchema:property>
</rdf:li>
</rdf:Bag>
</pdfaExtension:schemas>
</rdf:Description>
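
As for the DOM route mentioned above, here is a minimal sketch using only the JDK's XML classes. It builds one schema entry with a single property (IDDocumento); the other four properties would follow the same pattern. The class and helper names (XmpDomSketch, child, resourceItem, buildXmp) are made up for illustration, and the x:xmpmeta / rdf:RDF wrapper is my assumption about what a complete XMP document should contain, since the question's fragment starts at rdf:Description. I have not checked the result against a PDF/A validator.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmpDomSketch {

    static final String RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String EXT = "http://www.aiim.org/pdfa/ns/extension/";
    static final String SCHEMA = "http://www.aiim.org/pdfa/ns/schema#";
    static final String PROP = "http://www.aiim.org/pdfa/ns/property#";

    // Hypothetical helper: append a namespaced child, optionally with text.
    static Element child(Element parent, String ns, String qName, String text) {
        Element e = parent.getOwnerDocument().createElementNS(ns, qName);
        if (text != null) {
            e.setTextContent(text);
        }
        parent.appendChild(e);
        return e;
    }

    // Hypothetical helper: append <rdf:li rdf:parseType="Resource">.
    static Element resourceItem(Element parent) {
        Element li = child(parent, RDF, "rdf:li", null);
        li.setAttributeNS(RDF, "rdf:parseType", "Resource");
        return li;
    }

    static byte[] buildXmp() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().newDocument();

        // Assumed wrapper: x:xmpmeta > rdf:RDF > rdf:Description
        Element meta = doc.createElementNS("adobe:ns:meta/", "x:xmpmeta");
        doc.appendChild(meta);
        Element rdf = child(meta, RDF, "rdf:RDF", null);
        Element desc = child(rdf, RDF, "rdf:Description", null);
        desc.setAttributeNS(RDF, "rdf:about", "");

        // pdfaExtension:schemas > rdf:Bag > one schema entry
        Element schemas = child(desc, EXT, "pdfaExtension:schemas", null);
        Element schema = resourceItem(child(schemas, RDF, "rdf:Bag", null));
        child(schema, SCHEMA, "pdfaSchema:schema", "ABI Assegni Schema");
        child(schema, SCHEMA, "pdfaSchema:namespaceURI", "http://abi.it/std/cheque/xmlns");
        child(schema, SCHEMA, "pdfaSchema:prefix", "assegni");

        // pdfaSchema:property > rdf:Seq > one property description
        Element property = child(schema, SCHEMA, "pdfaSchema:property", null);
        Element prop = resourceItem(child(property, RDF, "rdf:Seq", null));
        child(prop, PROP, "pdfaProperty:name", "IDDocumento");
        child(prop, PROP, "pdfaProperty:valueType", "Text");
        child(prop, PROP, "pdfaProperty:category", "external");
        child(prop, PROP, "pdfaProperty:description", "Identificativo univoco del documento");

        // Serialize the DOM tree to UTF-8 bytes without an XML declaration.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        t.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(new String(buildXmp(), StandardCharsets.UTF_8));
    }
}
```

The byte[] returned by buildXmp() could then be passed to stamper.setXmpMetadata(...) in place of the bytes read from metadata.xml. Where the namespace declarations end up in the serialized text depends on the serializer, so it may not match the layout of the question's XML character for character.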

If possible, use an alternative library such as Apache PDFBox (an independent library, not an iText wrapper), or move to iText 7.

fundagain answered Oct 11 '22 00:10
