
How do I Combine/Merge PDFs with Fillable Form Fields using iTextSharp?

Using iTextSharp, how can I merge multiple PDFs into one PDF without losing the Form Fields and their properties in each individual PDF?

(I would prefer an example using streams from a database but file system is ok as well)

I found this code that works but it flattens out my PDFs so I can't use it.

UPDATE

@Mark Storer - This is the code I am using now based on your feedback (see below) but it gives me a corrupt document after the save. I tested each of the code parts separately and it seems to be failing in the MergePdfForms function shown below. I obviously don't want to use the renameFields part of your example because I need the field names to remain "as is".

Public Sub MergePdfForms(ByVal pdfFiles As ArrayList, ByVal outputPath As String)
    Dim ms As New IO.MemoryStream()
    Dim copier As New PdfCopyFields(ms)
    For Each pfile As String In pdfFiles
        Dim reader As New PdfReader(pfile)
        copier.AddDocument(reader)
    Next
    SaveMemoryStream(ms, outputPath)
    copier.Close()
End Sub

Public Sub SaveMemoryStream(ms As IO.MemoryStream, FileName As String)
    Dim outStream As IO.FileStream = IO.File.OpenWrite(FileName)
    ms.WriteTo(outStream)
    outStream.Flush()
    outStream.Close()
End Sub

RichC asked Jun 13 '11 03:06


2 Answers

Fields in PDFs have an Unusual Property: All fields with the same name are the same field. They share a value. This is handy when the form refers to the same person and you have a nice naming scheme across forms. It's Not Handy when you want to put 20 instances of a single form into a single PDF.

This makes merging multiple forms challenging, to say the least. The most common option (thanks to iText) is to flatten the forms prior to merging them, at which point you're no longer merging forms, and the problem Goes Away.

The other option is to rename your fields prior to merging them. This can make data extraction difficult later, can break scripts, and is generally a PITA. That's why flattening is so much more popular.
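
To make the name clash concrete, here is a toy model in plain Java. It is only a sketch: a form is modelled as a map from field name to value, `merge` and `withPrefix` are made-up helpers, and no iText classes are involved. Which value a real viewer ends up showing for a shared field is not something this models.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a "form" is a map from field name to value.
class FieldCollisionDemo {

    // Merge forms so that each field name exists once, as same-named PDF fields do.
    static Map<String, String> merge(List<Map<String, String>> forms) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (Map<String, String> form : forms) {
            merged.putAll(form); // in this toy, a repeated name overwrites the earlier value
        }
        return merged;
    }

    // Return a copy of the form with every field name prefixed.
    static Map<String, String> withPrefix(Map<String, String> form, String prefix) {
        Map<String, String> renamed = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : form.entrySet()) {
            renamed.put(prefix + e.getKey(), e.getValue());
        }
        return renamed;
    }

    public static void main(String[] args) {
        Map<String, String> alice = Map.of("name", "Alice");
        Map<String, String> bob = Map.of("name", "Bob");

        // Two copies of the same form collapse into a single "name" field.
        System.out.println(merge(List.of(alice, bob)));
        // prints {name=Bob}

        // With a per-document prefix both values survive.
        System.out.println(merge(List.of(withPrefix(alice, "_0."), withPrefix(bob, "_1."))));
        // prints {_0.name=Alice, _1.name=Bob}
    }
}
```

The second call also shows the cost mentioned above: anything that later looks up `name` has to know about the prefix.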

There's a class in iText called PdfCopyFields, and it will correctly copy fields from one document to another... it will also merge fields with the same name correctly, such that they really share a single value and Acrobat/Reader doesn't have to do a bunch of extra work on the file to get it that way before displaying it to a user.

However, PdfCopyFields will not rename fields for you. To do that, you need to get the AcroFields object from the PdfReader in question, and call renameField(String, String) on Each And Every Field prior to merging the documents with PdfCopyFields.

All this is for "AcroForm"-based PDF forms. If you're dealing with XFA forms (forms from LiveCycle Designer), all bets are off. You have to muck with the XML, A Lot.

And heaven help you if you have to combine forms from both.

So ass-u-me-ing that you're working with AcroForm fields, the code might look something like this (forgive my Java):

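// Needs java.io.FileOutputStream, java.io.IOException, java.util.ArrayList,
// java.util.List, plus iText's PdfCopyFields, PdfReader, AcroFields and
// DocumentException (the iText package name depends on the version).
public void mergeForms(String outpath, String[] inPaths)
    throws IOException, DocumentException {
  PdfCopyFields copier = new PdfCopyFields(new FileOutputStream(outpath));
  for (String curInPath : inPaths) {
    PdfReader reader = new PdfReader(curInPath);
    renameFields(reader.getAcroFields());

    copier.addDocument(reader);
  }
  copier.close();
}

private static int counter = 0;
private void renameFields(AcroFields fields) {
  // Copy the names first: renaming changes the map behind getFields().
  List<String> fieldNames = new ArrayList<String>(fields.getFields().keySet());
  String prepend = String.format("_%d.", counter++);

  for (String fieldName : fieldNames) {
    // renameField returns false when it rejects a rename (its Javadoc
    // restricts which part of a dotted name may change), so it is worth
    // checking the result against your iText version.
    fields.renameField(fieldName, prepend + fieldName);
  }
}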

Ideally, renameFields would also create a generic field object named with prepend's value and make all the other fields in the document its children. This would make Acrobat/Reader's life easier and avoid an apparently unnecessary "save changes?" request when closing the resulting PDF from Acrobat.

Yes, that's why Acrobat will sometimes ask you to save changes when You Didn't Do Anything! Acrobat did something behind the scenes.
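
The parent/child idea works because a PDF field's full name is its ancestors' partial names joined with dots, so `_0.name` is a child `name` under a parent `_0`. A small plain-Java sketch of that grouping (`groupByParent` is a hypothetical helper, not an iText call):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class FieldHierarchyDemo {

    // Group fully qualified field names by the part before the first dot.
    // Names without a dot are filed under the empty parent "".
    static Map<String, List<String>> groupByParent(List<String> fullNames) {
        Map<String, List<String>> tree = new TreeMap<>();
        for (String full : fullNames) {
            int dot = full.indexOf('.');
            String parent = dot < 0 ? "" : full.substring(0, dot);
            String child = dot < 0 ? full : full.substring(dot + 1);
            tree.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
        }
        return tree;
    }

    public static void main(String[] args) {
        System.out.println(groupByParent(
                List.of("_0.name", "_0.city", "_1.name", "_1.city")));
        // prints {_0=[name, city], _1=[name, city]}
    }
}
```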

Mark Storer answered Oct 13 '22 11:10


You can also use this code. It merges all the PDF files without losing field values.

    // Needs: using System.IO; using iTextSharp.text.pdf;
    // Assumes fileList (file names) and sourcefolder are defined by the caller.
    string destinationfile = @"d:\outputfile.pdf";
    PdfCopyFields copier = new PdfCopyFields(new FileStream(destinationfile, FileMode.Create));
    try
    {
        // Loop over each file that has been listed
        foreach (string filename in fileList)
        {
            // The current file path
            string filePath = Path.Combine(sourcefolder, filename);

            PdfReader reader = new PdfReader(filePath);
            copier.AddDocument(reader);
        }
    }
    finally
    {
        // Close the copier last; the output is not complete until this runs
        copier.Close();
    }
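
Note the order in the code above: the copier is closed before anyone looks at the output. The MergePdfForms in the question does it the other way round, copying the MemoryStream to disk and only then calling copier.Close(). As far as I know PdfCopyFields does its actual writing in Close(), so that is a likely cause of the corrupt file, though I have not run the question's code. The same effect can be shown with plain Java streams. In this sketch GZIPOutputStream stands in for the copier, and `write` and `isComplete` are made-up names:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class CloseBeforeCopyDemo {

    // Returns true if the bytes form a complete, readable gzip stream.
    static boolean isComplete(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            in.readAllBytes();
            return true;
        } catch (IOException e) {
            return false; // truncated or empty stream
        }
    }

    // Write through a buffering writer and snapshot the in-memory buffer
    // either after closing the writer (closeFirst) or before closing it.
    static byte[] write(boolean closeFirst) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            GZIPOutputStream writer = new GZIPOutputStream(buffer);
            writer.write("form data".getBytes(StandardCharsets.UTF_8));
            if (closeFirst) {
                writer.close();              // flushes the remaining bytes and the trailer
                return buffer.toByteArray();
            }
            byte[] tooEarly = buffer.toByteArray(); // snapshot taken before close
            writer.close();
            return tooEarly;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("closed first: " + isComplete(write(true)));
        System.out.println("copied too early: " + isComplete(write(false)));
    }
}
```

For the question's VB that would mean calling copier.Close() before SaveMemoryStream. If Close() also closes the MemoryStream, ms.ToArray() is documented to keep working on a closed MemoryStream, so the bytes can still be written out with IO.File.WriteAllBytes.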

manu answered Oct 13 '22 11:10