Can iTextSharp convert a PDF document to PDF/A?

I can't find in the FAQ whether the API supports this, although a book mentions it as something that is potentially available. Has anyone had experience implementing this feature?

JohnL asked Mar 02 '10 21:03



2 Answers

In this thread (dated June 2007), Paulo Soares provides code that demonstrates PDF/A support. Here's the C# code (he also has a Java sample):

private void PdfATest() {
    Document doc = new Document(PageSize.A4);
    PdfWriter writer = PdfWriter.GetInstance(doc, new FileStream("C:\\hello_A1-b_cs.pdf", FileMode.Create));
    writer.PDFXConformance = PdfWriter.PDFA1B;
    doc.Open();

    PdfDictionary outi = new PdfDictionary(PdfName.OUTPUTINTENT);
    outi.Put(PdfName.OUTPUTCONDITIONIDENTIFIER, new PdfString("sRGB IEC61966-2.1"));
    outi.Put(PdfName.INFO, new PdfString("sRGB IEC61966-2.1"));
    outi.Put(PdfName.S, PdfName.GTS_PDFA1);

    // get this file here: http://old.nabble.com/attachment/10971467/0/srgb.profile
    ICC_Profile icc = ICC_Profile.GetInstance("c:\\srgb.profile");
    PdfICCBased ib = new PdfICCBased(icc);
    ib.Remove(PdfName.ALTERNATE);
    outi.Put(PdfName.DESTOUTPUTPROFILE, writer.AddToBody(ib).IndirectReference);

    writer.ExtraCatalog.Put(PdfName.OUTPUTINTENTS, new PdfArray(outi));

    BaseFont bf = BaseFont.CreateFont("c:\\windows\\fonts\\arial.ttf", BaseFont.WINANSI, true);
    Font f = new iTextSharp.text.Font(bf, 12);
    doc.Add(new Paragraph("hello", f));

    writer.CreateXmpMetadata();

    doc.Close();
}

The thread linked above includes a download for the ICC profile file.

Jay Riggs answered Oct 13 '22 22:10


Here's my method for parsing an HTML file and creating a PDF/A archive document from it. It also embeds the fonts by using a stylesheet, which avoids the error "All the fonts must be embedded. This one isn't: Helvetica".

Hope this helps someone.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text;
using System.IO;
using iTextSharp.text.html.simpleparser;

namespace SaveAsPDF
{
    class HtmlPdfConverter
    {
        public void RendererWebForm2PDFArchive(string fileName)
        {
            Console.WriteLine("Parsing HTML " + fileName);
            Document document = new Document(PageSize.A4);

            try
            {
                // create a writer that listens to the document and writes the PDF stream to a file
                PdfWriter writer = PdfWriter.GetInstance(document, new FileStream(fileName + ".pdf", FileMode.Create));

                // mark the document as a PDF/A-1a archive
                writer.PDFXConformance = PdfWriter.PDFA1A;
                document.Open();

                // apply a stylesheet to change the font (and embed it)
                StyleSheet styles = new StyleSheet();
                FontFactory.Register("c:\\windows\\fonts\\verdana.ttf");
                styles.LoadTagStyle("body", "face", "Verdana");

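                // read the HTML and strip the <title> element
                // (RemoveTag is a helper that is not shown in this answer)
                string html;
                using (StreamReader sr = new StreamReader(fileName, Encoding.Default))
                {
                    html = sr.ReadToEnd();
                }
                html = RemoveTag(html, "<title>", "</title>");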

                //convert string to stream
                byte[] byteArray = Encoding.UTF8.GetBytes(html);
                MemoryStream ms = new MemoryStream(byteArray);

                //parse html
                HTMLWorker htmlWorker = new HTMLWorker(document);
                System.Collections.Generic.List<IElement> elements;

                elements = HTMLWorker.ParseToList(new StreamReader(ms), styles);
                foreach (IElement item in elements)
                {
                    document.Add(item);
                }

                writer.CreateXmpMetadata();
                document.Close();
                Console.WriteLine("Done");
            }
            catch (Exception e)
            {
                Console.Error.WriteLine(e.Message);
                Console.Error.WriteLine(e.StackTrace);
            }
        }
    }
}
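The code above calls RemoveTag, but the answer does not include its definition. What follows is only a minimal sketch of what such a helper might look like, assuming it is meant to delete everything from the opening tag through the closing tag, case-insensitively. The class name HtmlCleanup and the exact behavior are assumptions, not part of the original answer or of iTextSharp.

```csharp
using System;

static class HtmlCleanup
{
    // Hypothetical helper (not from the original answer): removes every span
    // that starts at startTag and ends at the next endTag, tags included.
    public static string RemoveTag(string html, string startTag, string endTag)
    {
        if (string.IsNullOrEmpty(html) ||
            string.IsNullOrEmpty(startTag) ||
            string.IsNullOrEmpty(endTag))
        {
            return html;
        }

        int start = html.IndexOf(startTag, StringComparison.OrdinalIgnoreCase);
        while (start >= 0)
        {
            int end = html.IndexOf(endTag, start + startTag.Length,
                                   StringComparison.OrdinalIgnoreCase);
            if (end < 0)
            {
                break; // unmatched opening tag: leave the remainder untouched
            }

            // Cut out the opening tag, its content, and the closing tag.
            html = html.Remove(start, end + endTag.Length - start);

            // Look for another occurrence from the position of the cut.
            start = html.IndexOf(startTag, start, StringComparison.OrdinalIgnoreCase);
        }
        return html;
    }
}
```

Placed inside HtmlPdfConverter as a private static method instead, it could be called exactly as in the answer: RemoveTag(html, "<title>", "</title>"). Note that plain string matching like this would miss an opening tag that carries attributes.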
Daniel Bergsten answered Oct 13 '22 23:10