
PDFtk throws a Java Exception when attempting to use 'fill_form' function

I have a PHP application that fills out a form from a database call. At present I am putting this together using PDFtk. I am able to run a number of PDFtk commands with no issue, and I am currently working out the desired command at the command line.

My call is currently this:

pdftk /var/www/html/CSR/template/job_card.pdf fill_form /var/www/html/CSR/template/wwwwu7mMH.fdf output /var/www/html/CSR/template/filled4.pdf

Running this exact call multiple times sometimes generates this error:

Unhandled Java Exception in create_output():
java.lang.ClassCastException: pdftk.com.lowagie.text.pdf.PdfNull cannot be cast to pdftk.com.lowagie.text.pdf.PdfDictionary
   at pdftk.com.lowagie.text.pdf.FdfReader.readFields(pdftk)
   at pdftk.com.lowagie.text.pdf.FdfReader.readPdf(pdftk)
   at pdftk.com.lowagie.text.pdf.PdfReader.<init>(pdftk)
   at pdftk.com.lowagie.text.pdf.PdfReader.<init>(pdftk)
   at pdftk.com.lowagie.text.pdf.FdfReader.<init>(pdftk)

and this error sometimes:

Unhandled Java Exception in create_output():
Unhandled Java Exception in main():
java.lang.NullPointerException
   at gnu.gcj.runtime.NameFinder.lookup(libgcj.so.10)
   at java.lang.Throwable.getStackTrace(libgcj.so.10)
   at java.lang.Throwable.stackTraceString(libgcj.so.10)
   at java.lang.Throwable.printStackTrace(libgcj.so.10)
   at java.lang.Throwable.printStackTrace(libgcj.so.10)

The error message alternates, but the command never works and the form is never filled. As I say, though, PDFtk works with other commands: I have been able to generate encrypted PDFs and run the fixed commands successfully.

My question is what is causing this error and how do I fix it?

user3192649 asked Apr 14 '16 04:04


3 Answers

I see my name in the stack trace. That's not a coincidence: PdfTk is based on a mighty old version of iText. iText is a Java PDF library that was originally written by me, but used by a third party to create PdfTk.

The error tells you that iText is parsing a PDF that has either an error, or an unexpected feature.

A PDF consists of PDF objects such as PDF string objects, PDF number objects, PDF array objects, PDF dictionary objects, PDF stream objects, and so on. iText is able to retrieve these objects and to reuse them to create a new PDF. In your case, a new PDF with some form fields that are filled out is created based on the objects of the original PDF.

It is impossible to answer your question without seeing the PDF that causes the problem, but let's say that your PDF contains an /AcroForm entry with a /Fields array. In this fields array, there is a reference to a field dictionary. Suppose that one of the field dictionaries in your PDF isn't a dictionary, but a PDF null object. The form shows up perfectly in Adobe Reader, but internally, there is a flaw that prevents proper processing of the form.

iText will then loop over the entries in the fields array, and one of those entries won't return a field dictionary, but a PdfNull object. The result is a ClassCastException, because you can't cast PdfNull to PdfDictionary.

This being said:

  • If I see my name in your stack trace, this triggers an alarm, because it means that you're using an iText version that predates iText 5. Such a version should no longer be used. You should use a more recent version of iText. There is a high chance that a more recent version of iText gives you either a better error message, or tolerates (and maybe even fixes) the error in the PDF.
  • If you find a PdfTk version that uses a more recent version of iText, that would surprise me, because as far as I know, PdfTk isn't available under the AGPL, nor is PDF Labs (the owner of PdfTk) a customer of iText Software.
  • If you want to keep on using PdfTk, you shouldn't expect an answer as long as you don't share the PDF document that you're trying to fill.

One thing you could try: open the form in Adobe Acrobat. Save the form in Adobe Acrobat. There is a chance that the saved form no longer has the problem. Adobe Acrobat is very tolerant towards errors in PDFs. It tries to fix as many as it can. Then when you save the form, the error is gone.

Bruno Lowagie answered Oct 17 '22 17:10


As it turns out, the issue was not, as Bruno Lowagie suggested, an inconsistency in the PDF itself.

I had run out of ideas, so I tried generating the FDF a different way, by running the command:

pdftk /full/path/to/template.pdf generate_fdf output /full/path/to/output.fdf

and then inspecting the resulting file, I was able to build a more accurate FDF. When I then ran the fill_form command:

pdftk /full/path/to/template.pdf fill_form /full/path/to/output.fdf output /full/path/to/output.pdf

I got a proper response and everything worked. So the problem I was getting was in fact caused by the FDF being malformed in some way.
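One way to do that kind of inspection is to list the field names in the generate_fdf output and compare them with the names in the FDF the application writes. The following is a minimal Python sketch and not part of the original PHP solution; the function name fdf_field_names and the regular expression are illustrative assumptions, and it only handles simple /T (name) entries with ASCII names (nested fields would show up under their partial names).

```python
import re

# Matches "/T (name)" entries. Inside the parentheses we accept either a
# backslash escape (such as \( or \)) or any character that is not a
# backslash or a parenthesis.
FIELD_NAME = re.compile(r"/T\s*\(((?:\\.|[^\\()])*)\)")


def fdf_field_names(fdf_text):
    """Return the field names found in an FDF document, in file order."""
    names = []
    for raw in FIELD_NAME.findall(fdf_text):
        # Undo the escaping of parentheses and backslashes in the name.
        names.append(re.sub(r"\\([()\\])", r"\1", raw))
    return names


# Hypothetical sample laid out like a small generate_fdf result.
sample = (
    "%FDF-1.2\n"
    "1 0 obj\n"
    "<< /FDF << /Fields [\n"
    "<< /V () /T (customer_name) >>\n"
    "<< /V () /T (job \\(ref\\)) >>\n"
    "] >> >>\n"
    "endobj\n"
    "trailer\n"
    "<< /Root 1 0 R >>\n"
    "%%EOF\n"
)

print(fdf_field_names(sample))
```

When reading a real file, decoding the bytes as Latin-1 avoids decode errors on any binary comment bytes near the top of the FDF.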

My final solution, if anyone is interested, takes a template PDF with fields, generates an FDF to fill it, creates a new PDF by merging the FDF data into the template PDF, and redirects the browser to the new PDF's location.
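The original code is not reproduced here, so the following is only a rough Python sketch of the fill step, using the same pdftk fill_form invocation shown above. The helper names fill_form_command and fill_form and the temporary file name data.fdf are assumptions made for illustration, and the FDF text is assumed to be already well formed and limited to Latin-1 characters.

```python
import subprocess
import tempfile
from pathlib import Path


def fill_form_command(template_pdf, fdf_path, output_pdf):
    """Build the argument list for: pdftk TEMPLATE fill_form FDF output OUT."""
    return ["pdftk", str(template_pdf), "fill_form", str(fdf_path),
            "output", str(output_pdf)]


def fill_form(template_pdf, fdf_text, output_pdf):
    """Write fdf_text to a temporary file and ask pdftk to fill the template."""
    with tempfile.TemporaryDirectory() as tmp:
        fdf_path = Path(tmp) / "data.fdf"
        # Assumption: the FDF contains only Latin-1 characters.
        fdf_path.write_bytes(fdf_text.encode("latin-1"))
        # check=True raises CalledProcessError if pdftk exits with an error.
        subprocess.run(fill_form_command(template_pdf, fdf_path, output_pdf),
                       check=True, capture_output=True)
    # Do not rely on the exit status alone: confirm the output was written.
    if not Path(output_pdf).exists():
        raise RuntimeError("pdftk did not produce " + str(output_pdf))
    return Path(output_pdf)
```

Passing the arguments as a list instead of a single shell string means paths containing spaces need no extra quoting. The redirect to the finished PDF would be handled by the web application and is left out.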

Big thanks to Bruno Lowagie for helping me understand the system better and rule out a few things.

user3192649 answered Oct 17 '22 17:10


It looks like PDFtk was not able to process strings that contained the characters ( and ). I replaced them with \( and \) to escape them, and it worked well.
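A minimal Python sketch of that escaping, applied while building a small FDF by hand. The function names fdf_escape and build_fdf and the sample field are made up for illustration. The sketch also escapes the backslash itself, since it is the escape character in PDF literal strings, and it assumes values limited to Latin-1 characters; other text would need a different string encoding that is not handled here.

```python
def fdf_escape(value):
    """Escape the characters that are special inside a PDF literal string."""
    # Escape the backslash first so the backslashes added for the
    # parentheses are not themselves doubled.
    return (value.replace("\\", "\\\\")
                 .replace("(", "\\(")
                 .replace(")", "\\)"))


def build_fdf(fields):
    """Return FDF text that sets each field name in `fields` to its value."""
    entries = "\n".join(
        f"<< /T ({fdf_escape(name)}) /V ({fdf_escape(value)}) >>"
        for name, value in fields.items()
    )
    return (
        "%FDF-1.2\n"
        "1 0 obj\n"
        "<< /FDF << /Fields [\n"
        f"{entries}\n"
        "] >> >>\n"
        "endobj\n"
        "trailer\n"
        "<< /Root 1 0 R >>\n"
        "%%EOF\n"
    )


# Hypothetical field name and value.
print(build_fdf({"notes": "Call (urgent)"}))
```

Escaping every parenthesis, even balanced ones, is slightly more than a PDF literal string strictly needs, but it keeps the rule simple and matches the fix described above.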

Guest answered Oct 17 '22 17:10