
Extract PDF form field names from a PDF form

Tags: .net, php, pdf

I'm using pdftk to fill in a PDF form with an XFDF file. However, for this project I do not know in advance what fields will be present, so I need to analyse the PDF itself to see what fields need to be filled in, present an interface to the user accordingly, and then generate an XFDF file from that to fill in the PDF form.

How do I get the field names? Preferably command-line, .NET or PHP solutions.

Christopher Done asked Jan 24 '10 16:01




2 Answers

Easy! You are already using pdftk, and it can list the fields itself:

 pdftk input.pdf dump_data_fields

It outputs each field's name, its type, some of its properties (such as the options of a drop-down list or the text alignment), and even the tooltip text, which I found extremely useful.

The only thing I'm missing is field coordinates...
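For feeding the result into something else (such as building a form UI), the dump can be reduced to one line per field. The following is only a sketch: `list_fields` is a made-up helper name, and it assumes the usual output layout in which each field's record starts with a `---` line and contains `FieldName:` and `FieldType:` lines.

```shell
# Read `pdftk ... dump_data_fields` output on stdin and print one
# "name<TAB>type" line per form field.
# Assumption: each field's record starts with a "---" line.
list_fields() {
  awk '
    function emit() {
      if (name != "") print name "\t" type
      name = ""; type = ""
    }
    /^---/         { emit(); next }            # start of a new field record
    /^FieldName: / { name = substr($0, 12) }   # text after "FieldName: "
    /^FieldType: / { type = substr($0, 12) }   # text after "FieldType: "
    END            { emit() }                  # emit the last record
  '
}

# Typical use (needs pdftk installed):
#   pdftk input.pdf dump_data_fields | list_fields
```

The tab-separated output is easy to split in PHP or .NET when generating the XFDF file afterwards.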

TEHEK answered Sep 17 '22 14:09


This worked for me:

 pdftk 1.pdf dump_data_fields output test2.txt 

If the file is encrypted with a password, this is how you can read from it:

 pdftk 1.pdf input_pw YOUR_PASSWORD_GOES_HERE dump_data_fields output test2.txt 

This took me 2 hours to get right, so hopefully I save you some time :)
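If the same script has to handle both protected and unprotected PDFs, the two commands above can be wrapped in one function. This is a minimal sketch; `dump_fields` and its argument order are made up for illustration, and the field data goes to standard output because no `output` file is given.

```shell
# Dump the form fields of the PDF given as $1 to standard output.
# If a password is given as $2, it is passed to pdftk via input_pw.
dump_fields() {
  pdf=$1
  password=${2:-}
  if [ -n "$password" ]; then
    pdftk "$pdf" input_pw "$password" dump_data_fields
  else
    pdftk "$pdf" dump_data_fields
  fi
}

# Typical use (needs pdftk installed):
#   dump_fields 1.pdf > test2.txt
#   dump_fields 1.pdf YOUR_PASSWORD_GOES_HERE > test2.txt
```

Keep in mind that a password passed on the command line may be visible to other users in the process list.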

Dev_Corps answered Sep 21 '22 14:09