I have the code below:
from PyPDF2 import PdfFileReader, PdfFileWriter
d = {
    "Name": "James",
    " Date": "1/1/2016",
    "City": "Wilmo",
    "County": "United States"
}
reader = PdfFileReader("medicareRRF.pdf")
inFields = reader.getFields()
watermark = PdfFileReader("justSign.pdf")
writer = PdfFileWriter()
page = reader.getPage(0)
page.mergePage(watermark.getPage(0))
writer.addPage(page)
written_page = writer.getPage(0)
writer.updatePageFormFieldValues(written_page, d)
This correctly fills in the PDF from the dictionary (d), but how can I check and uncheck boxes on the PDF? Here is the getFields() info for one of the boxes:
u'Are you ok': {'/FT': '/Btn','/Kids': [IndirectObject(36, 0),
IndirectObject(38, 0)],'/T': u'Are you ok','/V': '/No'}
I tried adding {'Are you ok': '/Yes'} to the dictionary, along with several similar variations, but nothing worked.
PyPDF2 is a Python library for common PDF tasks such as extracting document information, merging and splitting PDF files, adding watermarks, and encrypting or decrypting PDFs.
I came across the same issue, looked in several places, and was disappointed that I couldn't find the answer. After a few frustrating hours looking at my code, the PyPDF2 code, and the Adobe PDF 1.7 spec, I finally figured it out. If you debug into updatePageFormFieldValues, you'll see that it only writes TextStringObjects. Checkboxes are not text fields, and even their /V values are not text strings, which seemed counterintuitive, at least to me. Debugging into that function showed me that checkbox values are NameObjects instead, so I created my own function to handle them. I create two dicts: one with only text values, which I pass to the built-in updatePageFormFieldValues function, and a second with only checkbox values. I also set /AS so that the checked state is actually displayed (see the PDF spec). My function looks like this:
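from PyPDF2.generic import NameObject

def updateCheckboxValues(page, fields):
    # Walk every annotation on the page and update the ones named in fields
    for j in range(0, len(page['/Annots'])):
        writer_annot = page['/Annots'][j].getObject()
        for field in fields:
            if writer_annot.get('/T') == field:
                # Checkbox values are names, not text strings
                writer_annot.update({
                    NameObject("/V"): NameObject(fields[field]),
                    NameObject("/AS"): NameObject(fields[field])
                })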
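The split into a text dict and a checkbox dict can probably be automated from the /FT entries that getFields() reports. This is a minimal sketch, assuming the field info behaves like a mapping from field name to a dict with an /FT key. split_form_values is a hypothetical helper name, not part of PyPDF2.

```python
def split_form_values(values, field_info):
    """Separate values for text fields from values for button fields.

    values:     {field name: value to set}
    field_info: {field name: dict-like with an '/FT' entry},
                assumed to look like the result of getFields()
    """
    text_values, checkbox_values = {}, {}
    for name, value in values.items():
        info = field_info.get(name) or {}
        # '/Btn' covers checkboxes, but also radio buttons and push buttons
        if info.get("/FT") == "/Btn":
            checkbox_values[name] = value
        else:
            text_values[name] = value
    return text_values, checkbox_values


# Illustrative field info, shaped like the getFields() dump in the question
info = {"Name": {"/FT": "/Tx"}, "Are you ok": {"/FT": "/Btn"}}
text, boxes = split_form_values({"Name": "James", "Are you ok": "/1"}, info)
# text  == {"Name": "James"}
# boxes == {"Are you ok": "/1"}
```

The first dict can then go to updatePageFormFieldValues and the second to updateCheckboxValues.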
However, as far as I can tell, whether you use /1, /On, or /Yes depends on how the form was defined or perhaps what the PDF reader is looking for. For me, /1 worked.
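Rather than guessing between /1, /On, and /Yes, the name can usually be read from the widget itself. Per the PDF spec, a checkbox's normal appearance dictionary (/AP, then /N) has one entry per state, and the entry that is not /Off is the checked state. The following is a sketch under that assumption. checkbox_on_state is a hypothetical helper name, and it relies only on dict-style access, so it should work on the annotation objects as well as on plain dicts.

```python
def checkbox_on_state(annot):
    """Return the name of the 'checked' appearance state, or None if not found.

    Assumes annot['/AP']['/N'] is a dictionary keyed by state name,
    as is usual for checkbox widgets.
    """
    if "/AP" not in annot:
        return None
    appearance = annot["/AP"]
    if "/N" not in appearance:
        return None
    for state in appearance["/N"].keys():
        if state != "/Off":
            return str(state)
    return None


# Plain-dict stand-in for a checkbox widget whose checked state is '/1'
widget = {"/FT": "/Btn", "/AP": {"/N": {"/Off": None, "/1": None}}}
on_state = checkbox_on_state(widget)  # '/1'
```

The returned name is what would be wrapped in NameObject for both /V and /AS; "/Off" unchecks the box.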
I would like to add to @rpsip's answer.
from PyPDF2 import PdfReader, PdfWriter
from PyPDF2.generic import NameObject

reader = PdfReader(r"form2.pdf")  # the PDF form, here in the same directory as the script
writer = PdfWriter()
page = reader.pages[0]  # first page of the PDF
fields = reader.get_fields()
print(fields)  # check that the form fields are visible

writer.add_page(page)  # this line is necessary, otherwise the PDF will be corrupted

for annot_ref in page["/Annots"]:
    annot = annot_ref.get_object()
    print(annot)  # find out which form fields are checkboxes and which are text fields
    # filter on the field type (button) and on the name of the checkbox I want, "Check Box3"
    if annot.get("/FT") == "/Btn" and annot.get("/T") == "Check Box3":
        print(annot)  # confirm that the filter matched the right field
        # checkbox values must be NameObjects; try "/Yes", "/1" or "/On" to see which works
        annot.update(
            {
                NameObject("/V"): NameObject("/Yes"),
                NameObject("/AS"): NameObject("/Yes"),
            }
        )

with open("filled-out.pdf", "wb") as output_stream:
    writer.write(output_stream)  # save the ticked form as "filled-out.pdf"
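One caveat for fields that have /Kids, like 'Are you ok' in the question: the widgets on the page may carry no /T of their own, with the name stored on the dictionary that /Parent points to. A filter on /T would then never match. Here is a minimal sketch of a lookup that handles both cases. field_name is a hypothetical helper, and it returns the nearest /T rather than a fully qualified field name.

```python
def field_name(annot):
    """Return the nearest field name (/T) for a widget, following /Parent if needed."""
    node = annot
    while node is not None:
        if "/T" in node:
            return str(node["/T"])
        # Move up to the parent field, if there is one
        node = node["/Parent"] if "/Parent" in node else None
    return None


# Plain-dict stand-ins: a parent field and one of its kid widgets
parent = {"/T": "Are you ok", "/FT": "/Btn", "/V": "/No"}
kid = {"/Subtype": "/Widget", "/Parent": parent}
name = field_name(kid)  # 'Are you ok'
```

For such a field, my understanding of the spec is that /V belongs on the parent while /AS is set on each kid widget, so both objects may need updating.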
Hope this helps.