How can I read PDF document properties using Perl and CAM::PDF?

Tags: pdf, perl, cam-pdf

I want to read some PDF document properties with Perl. I already have CAM::PDF installed on my system.

Can this module read the properties of a PDF document? If so, could someone give an example or point me to the relevant subroutine?

Or should I use another module? If so, which one?

Asked by smith, Nov 17 '11 17:11



2 Answers

I like the PDF::API2 answer from Sinan Ünür. PDF::API2 is awesome.

I'm the author of CAM::PDF. Sorry I missed this question earlier. CAM::PDF comes with a command-line tool, pdfinfo.pl, that extracts this sort of data.

My library does not officially support this, but it's easy to do if you don't mind reaching into its internals.

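#!perl -w
use strict;
use CAM::PDF;

# Path of the PDF to inspect, taken from the command line.
my $infile = shift || die 'syntax...';
my $pdf = CAM::PDF->new($infile) || die "$CAM::PDF::errstr\n";

# The trailer's Info entry refers to the document information dictionary.
# Reading $pdf->{trailer} directly is the "hacking into internals" part.
my $info = $pdf->getValue($pdf->{trailer}->{Info});
if ($info) {
    for my $key (sort keys %{$info}) {
        my $value = $info->{$key};
        if ($value->{type} eq 'string') {
            print "$key: $value->{value}\n";
        } else {
            # Non-string entries: print only the node type.
            print "$key: <$value->{type}>\n";
        }
    }
}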
Answered by Chris Dolan, Sep 21 '22 19:09


I do not know much about CAM::PDF. However, if you are willing to install PDF::API2, you can do:

#!/usr/bin/env perl

use strict; use warnings;

use Data::Dumper;
use PDF::API2;

my $pdf = PDF::API2->open('U3DElements.pdf');

print Dumper { $pdf->info };

Output:

$VAR1 = {
          'ModDate' => 'D:20090427131238-07\'00\'',
          'Subject' => 'Adobe Acrobat 9.0 SDK',
          'CreationDate' => 'D:20090427125930Z',
          'Producer' => 'Acrobat Distiller 9.0.0 (Windows)',
          'Creator' => 'FrameMaker 7.2',
          'Author' => 'Adobe Developer Support',
          'Title' => 'U3D Supported Elements'
        };
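The CreationDate and ModDate values in the output above use the PDF date format (D:YYYYMMDDHHmmSS followed by an optional UTC offset). If you want something easier to read, here is a minimal sketch of a converter. The name pdf_date_to_iso is a hypothetical helper, not part of PDF::API2 or CAM::PDF. It assumes missing month/day fields default to 01 and missing time fields to 00.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by PDF::API2 or CAM::PDF): turn a PDF
# date string such as "D:20090427131238-07'00'" into an ISO 8601-style string.
# Returns nothing if the input does not look like a PDF date.
sub pdf_date_to_iso {
    my ($raw) = @_;
    return unless defined $raw;

    my ($y, $mo, $d, $h, $mi, $s, $sign, $oh, $om) = $raw =~ m{
        \A (?:D:)?
        (\d{4})                        # year, the only required field
        (\d{2})? (\d{2})?              # month, day
        (\d{2})? (\d{2})? (\d{2})?     # hour, minute, second
        (?: ([-+Z]) (?: (\d{2}) '? (\d{2})? '? )? )?   # UTC offset
        \z
    }x or return;

    # Assumed defaults for fields the producer left out.
    $_ //= '01' for $mo, $d;
    $_ //= '00' for $h, $mi, $s;

    my $zone = '';
    if (defined $sign) {
        $zone = $sign eq 'Z'
            ? 'Z'
            : sprintf('%s%02d:%02d', $sign, $oh // 0, $om // 0);
    }

    return sprintf('%s-%s-%sT%s:%s:%s%s', $y, $mo, $d, $h, $mi, $s, $zone);
}

# The two date values from the output above.
print pdf_date_to_iso("D:20090427131238-07'00'"), "\n";   # 2009-04-27T13:12:38-07:00
print pdf_date_to_iso("D:20090427125930Z"), "\n";         # 2009-04-27T12:59:30Z
```

Note that Data::Dumper shows the apostrophes in ModDate as \' only because it prints the value inside single quotes. The stored string contains plain apostrophes.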
Answered by Sinan Ünür, Sep 23 '22 19:09