
Programmatically add comments to PDF header

Tags:

pdf

Has anyone had any success with adding additional information to a PDF file?

We have an electronic medical record system which produces medical documents for our users. In the past, those documents have been Print-To-File (.prn) files which we have fed to a system that displayed them as part of an enterprise medical record.

Now the hospital's enterprise medical record vendor wants to receive the documents as PDF, but still wants all of the same information stored in the header.

Honestly, we can't figure out how to put this information into a PDF file without breaking it.

Here is the start of one of our PDFs...

%PDF-1.4  
%âãÏÓ  
6 0 obj  
<<  
   /Type /XObject  
   /Subtype /Image  
   /BitsPerComponent 8  
   /Width 854  
   /Height 130  
   /ColorSpace /DeviceRGB  
   /Filter /DCTDecode  
   /Length 17734>>  
stream  

In our PRN files, we would insert information like this:

%MRN% TEST000001
%ACCT% TEST0000000000001
%DATE% 01/01/2009^16:44
%DOC_TYPE% Clinical
%DOC_NUM% 192837475
%DOC_VER% 1

My question is, can I insert this information into a PDF in a manner which allows the document server to perform post-processing, yet is NOT visible to the doctor who views the PDF?

Thank you,

David Walker

Asked by David Walker, Jun 09 '09 19:06



2 Answers

Yes, you can. Any line in a PDF file that starts with a percent sign (outside of a string or stream) is a comment and is ignored by readers; the first two lines of your PDF are in fact comments as well. So you can pretty much insert your information into the PDF the same way you did into the PRN.

However:

The PDF format works with byte-offset references, so if you insert data into the middle of a finished PDF file, everything after it is pushed away from its recorded position and the file breaks. You also cannot simply append the data to the file, because a PDF file has to end with

startxref
123456
%%EOF

(123456 is just an example; the number is the byte offset of the cross-reference table). You could insert your data right before these three lines. The byte position of the "startxref" part itself is never referenced anywhere, so you won't break anything by pushing this final part towards the end.
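For illustration, here is a minimal Python sketch of inserting the comment lines right before the final "startxref". The function name `insert_before_startxref` and the `%KEY% value` line layout (borrowed from the PRN example in the question) are assumptions of this sketch, and it assumes the file is neither signed nor encrypted.

```python
from typing import Mapping


def insert_before_startxref(pdf: bytes, fields: Mapping[str, str]) -> bytes:
    """Return a copy of `pdf` with %KEY% value comment lines inserted
    immediately before the last 'startxref' keyword.

    Hypothetical helper: the name and the comment layout are illustrative.
    """
    # The last 'startxref' begins the three-line ending of the file.
    pos = pdf.rfind(b"startxref")
    if pos == -1:
        raise ValueError("no 'startxref' keyword found; is this a complete PDF?")

    lines = []
    for key, value in fields.items():
        # A line break inside a value would end the comment early.
        if "\n" in value or "\r" in value:
            raise ValueError(f"value for {key} must be a single line")
        lines.append(f"%{key}% {value}\n".encode("ascii"))

    # Everything before `pos` keeps its byte offset; only the ending moves.
    return pdf[:pos] + b"".join(lines) + pdf[pos:]
```

Because the bytes before `pos` are untouched, the offset stored after "startxref" still points at the cross-reference table. Work on the raw bytes (open the file in binary mode), since text-mode decoding could alter the stream data.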

Edit: This of course assumes there is no checksumming, signing or encryption going on. That would make things more complicated.

Edit 2: As Javier correctly pointed out, you can also just append your data to the end of the file and then add a copy of those three lines after it. It boils down to the same thing, but it's a little easier.
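A minimal Python sketch of this append variant, under the same no-signing, no-encryption assumption. The name `append_after_eof` and the `%KEY% value` layout are made up for the example.

```python
from typing import Mapping


def append_after_eof(pdf: bytes, fields: Mapping[str, str]) -> bytes:
    """Append %KEY% value comment lines after the existing ending, then
    repeat the original startxref / offset / %%EOF ending after them.

    Hypothetical helper: the name and the comment layout are illustrative.
    """
    pos = pdf.rfind(b"startxref")
    if pos == -1:
        raise ValueError("no 'startxref' keyword found; is this a complete PDF?")

    # The original ending, copied verbatim so the offset stays the same.
    ending = pdf[pos:]

    # Comments must start on a fresh line.
    separator = b"" if pdf.endswith((b"\n", b"\r")) else b"\n"
    comments = b"".join(
        f"%{key}% {value}\n".encode("ascii") for key, value in fields.items()
    )
    return pdf + separator + comments + ending
```

The original file is kept byte for byte at the start of the result, so nothing that was referenced by offset has moved.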

Answered by balpha, Dec 21 '22 08:12


PDF is designed to allow multiple versions of a document by just appending to the end of the file, but the very end must contain the offset of the main cross-reference table. Just read the last three lines, append your data, and reattach the original ending.

You can either remove the original ending or leave it there. PDF readers just go to the end of the file and use the second-to-last line to find the cross-reference table.
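To make that lookup concrete, here is a simplified Python model of how a reader locates the table from the end of the file. It is only a sketch: `find_xref_offset` is a hypothetical name, the 1024-byte search window is an assumption of this example, and real readers are more tolerant and do considerably more.

```python
import re


def find_xref_offset(pdf: bytes, window: int = 1024) -> int:
    """Return the byte offset given by the last startxref / %%EOF ending.

    Simplified model of a PDF reader's first step; name and window size
    are assumptions of this sketch.
    """
    # Only the end of the file matters, so earlier copies of the ending
    # (left in place after appending data) are simply never consulted.
    tail = pdf[-window:]
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref / %%EOF ending found near end of file")
    return int(matches[-1])
```

This is why leaving the old ending in the middle of the file does no harm: only the final one is used.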

Answered by Javier, Dec 21 '22 09:12