 

How do I dump embedded ICC profile information in PDF? (command line or GUI tools)

Is there a command line or GUI tool that dumps information about the ICC profile and color conversion settings chosen under "Color management and PDF/X options for PDF" in Illustrator's PDF export dialog?


"Color management and PDF/X options for PDF" option of Illustrator

[image] http://blogs.adobe.com/vikrant/files/2012/05/grayscale_export.png

[manual] http://help.adobe.com/en_US/illustrator/cs/using/WS714a382cdf7d304e7e07d0100196cbc5f-6547a.html#WS714a382cdf7d304e7e07d0100196cbc5f-6540a

kaorukobo asked Jul 19 '13 07:07



1 Answer

Here is a command line based method to extract ICC color profiles from a PDF. It uses pdf-parser.py, a Python script written by security researcher Didier Stevens and available for download from his website.

However, this is not a specialized tool for ICC extraction. (I do not know of such a tool.) It is a generic command line tool for investigating PDF files.

Therefore you need to go through several steps in order to achieve the extraction.

Step 1: Determine the PDF object ID of the ICC profile

You have to use -s to search for the string ICCBased. (PDF files without an embedded ICC profile will not have this keyword [with the exception of possibly using it in their text contents...].)

pdf-parser.py -s ICCBased my.pdf

My test PDF returned this:

obj 18 0
 Type: 
 Referencing: 21 0 R

It seems that an ICC profile is to be found in PDF object 21.
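If pdf-parser.py is not at hand, the same search can be roughly approximated with a few lines of Python. This is only a sketch: the function name find_icc_object_ids is made up for illustration, and it scans the raw bytes of the file, so it will miss references that sit inside compressed object streams (as used by many PDF 1.5+ files).

```python
import re

# Matches the reference inside a color space array such as "[/ICCBased 21 0 R]".
ICC_REF = re.compile(rb"/ICCBased\s+(\d+)\s+(\d+)\s+R")


def find_icc_object_ids(pdf_bytes: bytes) -> list:
    """Return the sorted, de-duplicated object numbers referenced after /ICCBased."""
    return sorted({int(m.group(1)) for m in ICC_REF.finditer(pdf_bytes)})
```

Reading my.pdf in binary mode and passing its contents to this function should yield a list containing 21 for a file like the test PDF above.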

Step 2: Look at the PDF object found in step 1

You have to use -o 21 to see what PDF object 21 is:

pdf-parser.py -o 21 my.pdf

My test PDF returns this:

obj 21 0
 Type: 
 Referencing: 
 Contains stream

  <<
    /Alternate /DeviceRGB
    /Filter /FlateDecode
    /Length 2574
    /N 3
  >>

Ok, this looks like we are getting close...

Step 3: Dump the stream contained in the PDF object containing the profile

In step 2 we learned two important things:

  • PDF object 21 contains a stream (the contents of which are not shown when using only the -o 21 parameter of pdf-parser.py).
  • The object's stream has to be decompressed with the /FlateDecode filter in order to get to its content.

Hence we now have to run pdf-parser.py with two additional arguments:

  • -d filename in order to dump the stream of PDF object 21 to a file.
  • -f in order to filter/decompress the object's stream when dumping it to a file.

Command to run:

pdf-parser.py -o 21 -f -d 21.stream my.pdf
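To make clear what -f and -d do in this case, here is a minimal Python sketch of the same operation. It is not how pdf-parser.py is implemented; the function name extract_flate_stream is hypothetical, and the sketch assumes the object is stored uncompressed in the file body, uses /FlateDecode as its only filter, and that the byte sequence "endstream" does not occur inside the compressed data.

```python
import re
import zlib


def extract_flate_stream(pdf_bytes: bytes, obj_num: int, gen: int = 0) -> bytes:
    """Return the inflated stream of indirect object `obj_num gen`."""
    # Find "<obj_num> <gen> obj", then the "stream" keyword that follows its dictionary.
    pattern = re.compile(
        rb"(?<!\d)%d\s+%d\s+obj\b(.*?)stream\r?\n" % (obj_num, gen), re.DOTALL
    )
    match = pattern.search(pdf_bytes)
    # If "endobj" appears before "stream", the object itself has no stream.
    if match is None or b"endobj" in match.group(1):
        raise ValueError(f"object {obj_num} {gen} has no stream")
    start = match.end()
    end = pdf_bytes.index(b"endstream", start)
    # decompressobj() tolerates the end-of-line bytes that precede "endstream".
    return zlib.decompressobj().decompress(pdf_bytes[start:end])
```

Writing the returned bytes to 21.stream in binary mode would then correspond to the -d 21.stream argument.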

Step 4: Verify what was extracted

We now have dumped the stream of PDF object 21 to a file named 21.stream. Let's see what it contains:

file 21.stream
 21.stream: Microsoft ICM Color Profile

Looks like we succeeded. :-)
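On systems without the file command, a similar sanity check can be sketched in Python. It relies on two facts from the ICC profile format: the header is 128 bytes long, and bytes 36 to 39 hold the signature 'acsp'. The function name looks_like_icc_profile is invented for this example, and passing the check does not prove the profile is valid.

```python
import struct


def looks_like_icc_profile(data: bytes) -> bool:
    """Cheap plausibility check for an ICC profile (not a full validation)."""
    if len(data) < 128:  # the fixed ICC header alone is 128 bytes
        return False
    # Bytes 0-3: declared profile size as a big-endian unsigned 32-bit integer.
    (declared_size,) = struct.unpack(">I", data[:4])
    # Bytes 36-39: profile file signature, always b"acsp".
    return data[36:40] == b"acsp" and declared_size <= len(data)
```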

Step 5: Open the color profile

Let's see whether my Mac OS X system accepts this profile:

mv 21.stream 21.icm
open 21.icm

OS X opens the file with the ColorSync Utility and displays a window. Clicking on the list entries opens different information panes at the bottom of the window:

[image] Mac OS X ColorSync Utility showing various details about the extracted ICM profile.

Step 6: Use Argyll's iccdump to dump the contents of the ICC profile as text

Note that Graeme Gill's ArgyllCMS, the open source color management software available for Linux, Mac OS X and Windows, ships with a whole suite of command line tools. One of these is iccdump. We can use it to look at the properties of the newly extracted 21.icm file:

iccdump 21.icm

icc:
Header:
  size         = 3144 bytes
  CMM          = 'Lino'
  Version      = 2.1.0
  Device Class = Display
  Color Space  = RGB
  Conn. Space  = XYZ
  Date, Time   = 9 Feb 1998, 6:49:00
  Platform     = Microsoft
  Flags        = Not Embedded Profile, Use anywhere
  Dev. Mnfctr. = 'IEC '
  Dev. Model   = 'sRGB'
  Dev. Attrbts = Reflective, Glossy
  Rndrng Intnt = Perceptual
  Illuminant   = 0.964203, 1.000000, 0.824905    [Lab 100.000000, 0.000498, -0.000436]
  Creator      = 'HP  '

tag 0:
  sig      'cprt'
  type     'text'
  offset   336
  size     51
Text:
  No. chars = 43
    0x0000: Copyright (c) 1998 Hewlett-Packard Company

tag 1:
  sig      'desc'
  type     'desc'
  offset   388
  size     108
TextDescription:
  ASCII data, length 18 chars:
    0x0000: sRGB IEC61966-2.1
  No Unicode data
  ScriptCode Data, Code 0x0, length 18 chars
    0x0000: 73 52 47 42 20 49 45 43 36 31 39 36 36 2d 32 2e 31 00 

tag 2:
  sig      'wtpt'
  type     'XYZ '
  offset   496
  size     20
XYZArray:
  No. elements = 1

tag 3:
  sig      'bkpt'
  type     'XYZ '
  offset   516
  size     20
XYZArray:
  No. elements = 1

tag 4:
  sig      'rXYZ'
  type     'XYZ '
  offset   536
  size     20
XYZArray:
  No. elements = 1

tag 5:
  sig      'gXYZ'
  type     'XYZ '
  offset   556
  size     20
XYZArray:
  No. elements = 1

tag 6:
  sig      'bXYZ'
  type     'XYZ '
  offset   576
  size     20
XYZArray:
  No. elements = 1

tag 7:
  sig      'dmnd'
  type     'desc'
  offset   596
  size     112
TextDescription:
  ASCII data, length 22 chars:
    0x0000: IEC http://www.iec.ch
  No Unicode data
  ScriptCode Data, Code 0x0, length 22 chars
    0x0000: 49 45 43 20 68 74 74 70 3a 2f 2f 77 77 77 2e 69 65 63 2e 63 68 00 

tag 8:
  sig      'dmdd'
  type     'desc'
  offset   708
  size     136
TextDescription:
  ASCII data, length 46 chars:
    0x0000: IEC 61966-2.1 Default RGB colour space - sRGB
  No Unicode data
  ScriptCode Data, Code 0x0, length 46 chars
    0x0000: 49 45 43 20 36 31 39 36 36 2d 32 2e 31 20 44 65 66 61 75 6c 74 20 
...

tag 9:
  sig      'vued'
  type     'desc'
  offset   844
  size     134
TextDescription:
  ASCII data, length 44 chars:
    0x0000: Reference Viewing Condition in IEC61966-2.1
  No Unicode data
  ScriptCode Data, Code 0x0, length 44 chars
    0x0000: 52 65 66 65 72 65 6e 63 65 20 56 69 65 77 69 6e 67 20 43 6f 6e 64 
...

tag 10:
  sig      'view'
  type     'view'
  offset   980
  size     36
Viewing Conditions:
  XYZ value of illuminant in cd/m^2 = 19.644501, 20.371796, 16.808899
  XYZ value of surround in cd/m^2   = 3.928894, 4.074387, 3.361786
  Illuminant type = D50

tag 11:
  sig      'lumi'
  type     'XYZ '
  offset   1016
  size     20
XYZArray:
  No. elements = 1

tag 12:
  sig      'meas'
  type     'meas'
  offset   1036
  size     36
Measurement:
  Standard Observer = 1931 Two Degrees
  XYZ for Measurement Backing = 0.000000, 0.000000, 0.000000    [Lab 0.000000, 0.000000, 0.000000]
  Measurement Geometry = Unknown
  Measurement Flare =   1.0%
  Standard Illuminant = D65

tag 13:
  sig      'tech'
  type     'sig '
  offset   1072
  size     12
Signature
  Technology = Cathode Ray Tube Display

tag 14:
  sig      'rTRC'
  type     'curv'
  offset   1084
  size     2060
Curve:
  No. elements = 1024

tag 15:
  sig      'gTRC'
  type     'curv'
  offset   1084
  size     2060
Curve:
  No. elements = 1024

tag 16:
  sig      'bTRC'
  type     'curv'
  offset   1084
  size     2060
Curve:
  No. elements = 1024

P.S.:
ArgyllCMS contains a command line tool, extracticc, which can extract an embedded ICC profile from a TIFF file. It does not have a tool to extract a profile from a PDF file.

Kurt Pfeifle answered Oct 20 '22 10:10
