Writing metadata to a pdf using pyobjc

Tags: python, pdf, cocoa, pdfkit, pyobjc

I'm trying to write metadata to a pdf file using the following python code:

from Foundation import *
from Quartz import *

url = NSURL.fileURLWithPath_("test.pdf")
pdfdoc = PDFDocument.alloc().initWithURL_(url)
assert pdfdoc, "failed to create document"

print "reading pdf file"

attrs = {}
attrs[PDFDocumentTitleAttribute] = "THIS IS THE TITLE"
attrs[PDFDocumentAuthorAttribute] = "A. Author and B. Author"

PDFDocumentTitleAttribute = "test"

pdfdoc.setDocumentAttributes_(attrs)
pdfdoc.writeToFile_("mynewfile.pdf")   

print "pdf made"

This appears to work fine (no errors are printed to the console); however, when I examine the metadata of the new file, it is as follows:

PdfID0: 242b7e252f1d3fdd89b35751b3f72d3
PdfID1: 242b7e252f1d3fdd89b35751b3f72d3
NumberOfPages: 4

and the original file had the following metadata:

InfoKey: Creator
InfoValue: PScript5.dll Version 5.2.2
InfoKey: Title
InfoValue: Microsoft Word - PROGRESS  ON  THE  GABION  HOUSE Compressed.doc
InfoKey: Producer
InfoValue: GPL Ghostscript 8.15
InfoKey: Author
InfoValue: PWK
InfoKey: ModDate
InfoValue: D:20101021193627-05'00'
InfoKey: CreationDate
InfoValue: D:20101008152350Z
PdfID0: d5fd6d3960122ba72117db6c4d46cefa
PdfID1: 24bade63285c641b11a8248ada9f19
NumberOfPages: 4

So there are two problems: the new metadata is not being added, and the existing metadata is being cleared. What do I need to do to get this to work? My objective is to add metadata that reference management systems can import.

asked Nov 04 '10 19:11

djq

1 Answer

Mark is on the right track, but there are a few peculiarities that should be accounted for.

First, he is correct that pdfdoc.documentAttributes() returns an NSDictionary containing the document metadata. You would like to modify it, but an NSDictionary is immutable, so you have to copy it into an NSMutableDictionary first, as follows:

attrs = NSMutableDictionary.alloc().initWithDictionary_(pdfdoc.documentAttributes())

Now you can modify attrs as you did. There is no need to write PDFDocument.PDFDocumentTitleAttribute as Mark suggested; that will not work, because PDFDocumentTitleAttribute is declared as a module-level constant. Just use it as you did in your own code.
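To see why the copy matters, here is a plain-Python analogy that runs without PyObjC. It is only a sketch: MappingProxyType stands in for the read-only dictionary, and the keys are illustrative strings rather than the real PDFKit constants. The sample values are taken from the metadata in the question.

```python
from types import MappingProxyType

# Stand-in for the read-only dictionary returned by documentAttributes().
# The keys are plain illustrative strings, not the real PDFKit constants.
original = MappingProxyType({
    "Author": "PWK",
    "Producer": "GPL Ghostscript 8.15",
})

# Writing to the read-only mapping is rejected.
try:
    original["Title"] = "THIS IS THE TITLE"
    assignment_failed = False
except TypeError:
    assignment_failed = True

# Starting from an empty dict discards everything that was already there.
from_scratch = {}
from_scratch["Title"] = "THIS IS THE TITLE"

# Copying first keeps the old entries and lets new ones be added.
merged = dict(original)
merged["Title"] = "THIS IS THE TITLE"

print(assignment_failed)     # True
print(sorted(from_scratch))  # ['Title']
print(sorted(merged))        # ['Author', 'Producer', 'Title']
```

The from_scratch case mirrors the attrs = {} line in the question: only the keys you set yourself are in the dictionary you hand back, so the old entries are gone.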

Here is the full code that works for me:

from Foundation import *
from Quartz import *

# Load the existing PDF
url = NSURL.fileURLWithPath_("test.pdf")
pdfdoc = PDFDocument.alloc().initWithURL_(url)
assert pdfdoc, "failed to create document"

# Make a mutable copy of the current metadata so nothing is lost
attrs = NSMutableDictionary.alloc().initWithDictionary_(pdfdoc.documentAttributes())
attrs[PDFDocumentTitleAttribute] = "THIS IS THE TITLE"
attrs[PDFDocumentAuthorAttribute] = "A. Author and B. Author"

# Store the updated metadata and save under a new name
pdfdoc.setDocumentAttributes_(attrs)
pdfdoc.writeToFile_("mynewfile.pdf")
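If you need this for more than one file, the same steps could be wrapped in a function along these lines. This is a rough sketch rather than tested production code: merge_attributes and update_pdf_metadata are names I made up, the PyObjC imports are deferred into the function so the merging helper can be tried on any platform, and I am assuming documentAttributes() may return None for a document with no metadata.

```python
def merge_attributes(existing, updates):
    """Return a plain dict holding `existing` entries overlaid with `updates`.

    `existing` may be None (assumed possible for a document without
    metadata) or any mapping that can be iterated over its keys.
    """
    merged = {}
    if existing is not None:
        for key in existing:
            merged[key] = existing[key]
    merged.update(updates)
    return merged


def update_pdf_metadata(src_path, dst_path, updates):
    """Write src_path to dst_path, keeping old metadata and applying `updates`."""
    # Imported here so merge_attributes() stays usable without PyObjC,
    # which is only available on macOS.
    from Foundation import NSURL, NSMutableDictionary
    from Quartz import PDFDocument

    url = NSURL.fileURLWithPath_(src_path)
    pdfdoc = PDFDocument.alloc().initWithURL_(url)
    if pdfdoc is None:
        raise IOError("could not open %s" % src_path)

    merged = merge_attributes(pdfdoc.documentAttributes(), updates)
    # Hand PDFKit a Cocoa dictionary, as in the code above.
    attrs = NSMutableDictionary.dictionaryWithDictionary_(merged)
    pdfdoc.setDocumentAttributes_(attrs)

    # writeToFile_ reports success as a boolean.
    if not pdfdoc.writeToFile_(dst_path):
        raise IOError("could not write %s" % dst_path)
```

You would call it as update_pdf_metadata("test.pdf", "mynewfile.pdf", {PDFDocumentTitleAttribute: "THIS IS THE TITLE"}), with the constant imported from Quartz as before. Raising on a failed load or write is my own choice here, so that a silent failure does not look like success.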

answered Sep 28 '22 05:09

Tamás
