
Converting PDF to CMYK (with identify recognizing CMYK)

I am having a lot of trouble getting ImageMagick's identify to, well, identify a PDF as CMYK.

Essentially, let's say I'm building this file, test.tex, with pdflatex:

    \documentclass[a4paper,12pt]{article}

    %% https://tex.stackexchange.com/questions/13071
    \pdfcompresslevel=0

    %% http://compgroups.net/comp.text.tex/Making-a-cmyk-PDF
    %% ln -s /usr/share/color/icc/sRGB.icm .
    % \immediate\pdfobj stream attr{/N 4} file{sRGB.icm}
    % \pdfcatalog{%
    % /OutputIntents [ <<
    % /Type /OutputIntent
    % /S/GTS_PDFA1
    % /DestOutputProfile \the\pdflastobj\space 0 R
    % /OutputConditionIdentifier (sRGB IEC61966-2.1)
    % /Info(sRGB IEC61966-2.1)
    % >> ]
    % }

    %% http://latex-my.blogspot.com/2010/02/cmyk-output-for-commercial-printing.html
    %% https://tex.stackexchange.com/questions/9961
    \usepackage[cmyk]{xcolor}

    \begin{document}
    Some text here...
    \end{document}

If I then run identify on the resulting test.pdf file, it reports RGB no matter which options I've tried (at least those from the links in the source), and yet the colors in the file appear to be stored as CMYK. For the source above:

    $ grep -ia 'cmyk\|rgb\| k' test.pdf
    0 0 0 1 k 0 0 0 1 K
    0 0 0 1 k 0 0 0 1 K
    0 0 0 1 k 0 0 0 1 K
    0 0 0 1 k 0 0 0 1 K
    FontDirectory/CMR12 known{/CMR12 findfont dup/UniqueID known{dup
    /PTEX.Fullbanner (This is pdfTeX, Version 3.1415926-1.40.11-2.2 (TeX Live 2010) kpathsea version 6.0.0)

    $ identify -verbose 'test.pdf[0]'
    ...
      Type: Palette
      Endianess: Undefined
      Colorspace: RGB
      Depth: 16/8-bit
      Channel depth:
        red: 8-bit
        green: 8-bit
        blue: 8-bit
      Channel statistics:
        Red: ...
        Green: ...
        Blue: ...
      Histogram:
             5: (12593,11565,11822) #31312D2D2E2E rgb(49,45,46)
             4: (16448,15420,15677) #40403C3C3D3D rgb(64,60,61)
             9: (20303,19275,19532) #4F4F4B4B4C4C rgb(79,75,76)
            25: (23901,23130,23387) #5D5D5A5A5B5B rgb(93,90,91)
    ...

Much the same happens if I also uncomment the \immediate\pdfobj stream ... part. And if there is only one color (black) in the document, I don't see where identify gets a histogram of RGB values from (although, arguably, all of them are close to gray)?!
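As a rough cross-check that does not go through identify at all, the grep above can be turned into a small counter of color operators. This is only a sketch: count_color_ops is a made-up name, it assumes the content streams are uncompressed (as with \pdfcompresslevel=0 here), and it merely looks for the PDF operators k/K (CMYK), rg/RG (RGB) and g/G (gray) preceded by a number.

```shell
#!/bin/bash
# Hypothetical helper: count fill/stroke color operators in a PDF.
# Heuristic only. It sees nothing inside compressed streams, ignores
# images and cs/scn color spaces, and may miscount tokens in text strings.
count_color_ops() {
  LC_ALL=C awk '
    # True if the token looks like a plain number such as 0, 1 or 0.5
    function isnum(s) { return s ~ /^-?[0-9]*\.?[0-9]+$/ }
    {
      for (i = 2; i <= NF; i++) {
        # A color operator is always preceded by a numeric operand
        if (!isnum($(i - 1))) continue
        if ($i == "k" || $i == "K") cmyk++
        else if ($i == "rg" || $i == "RG") rgb++
        else if ($i == "g" || $i == "G") gray++
      }
    }
    END { printf "cmyk=%d rgb=%d gray=%d\n", cmyk, rgb, gray }
  ' "$1"
}
```

Called as count_color_ops test.pdf, a nonzero cmyk count together with rgb=0 would suggest that the page content itself only sets CMYK colors, whatever identify reports.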

 

Never mind that, then. I thought I'd better try using ghostscript to convert test.pdf into a new PDF that identify would recognize as CMYK, but no luck there either:

    $ gs -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite -sOutputFile=test-gs.pdf \
        -dUseCIEColor -sProcessColorModel=DeviceRGB -dProcessColorModel=/DeviceCMYK \
        -sColorConversionStrategy=/CMYK test.pdf

    GPL Ghostscript 9.01 (2011-02-07)
    Copyright (C) 2010 Artifex Software, Inc.  All rights reserved.
    This software comes with NO WARRANTY: see the file PUBLIC for details.
    Processing pages 1 through 1.
    Page 1

    $ identify -verbose 'test-gs.pdf[0]'
    ...
      Type: Grayscale
      Base type: Grayscale
      Endianess: Undefined
      Colorspace: RGB
      Depth: 16/8-bit
    ...

So the only change that identify perceives is Type: Grayscale (previously Type: Palette); otherwise it still sees an RGB colorspace!

Along with this, note that identify is capable of correctly reporting a CMYK pdf - see CMYK poster example: fitting pdf page size to (bitmap) image size? #17843 - TeX - LaTeX - Stack Exchange for a command line example of generating such a PDF file using convert and gs. In fact, we can execute:

convert test.pdf -depth 8 -colorspace cmyk -alpha Off test-c.pdf 

... and this will result in a PDF that is identified as CMYK; however, the PDF will also be rasterized (at 72 dpi by default).

EDIT: I have just discovered that if I create an .odp presentation in OpenOffice and export it to PDF, that PDF will be RGB by default; however, the following command (from ghostscript Examples | Production Monkeys):

    # Color PDF to CMYK:
    gs -dSAFER -dBATCH -dNOPAUSE -dNOCACHE -sDEVICE=pdfwrite \
       -sColorConversionStrategy=CMYK -dProcessColorModel=/DeviceCMYK \
       -sOutputFile=output.pdf input.pdf

... actually will produce a CMYK pdf, reported as such by identify (although, the black will be rich, not plain - on all four channels); however, this command will work only when the slide has an added image (apparently, it is the one triggering the color conversion?!)! Funnily, I cannot get the same effect from a pdflatex PDF.

 

So I guess my question can be asked two ways:

  • Are there any command-line conversion methods in Linux that will convert an RGB pdf into a CMYK pdf while preserving vectors, so that the result is recognized as CMYK by identify (which will consequently build a correct histogram of CMYK colors)?
  • Are there any other command-line Linux tools similar to identify, which would recognize use of CMYK colors correctly even in the original test.pdf from pdflatex (and possibly build a color histogram, based on an arbitrarily chosen PDF page, like identify is supposed to)?

Thanks in advance for any answers,
Cheers!

 

Some references:

  • adobe - Script (or some other means) to convert RGB to CMYK in PDF? - Stack Overflow
  • color - PDF colour model and LaTeX - TeX - LaTeX - Stack Exchange
  • color - Option cmyk for xcolor package does not produce a CMYK PDF - TeX - LaTeX - Stack Exchange
  • Making a cmyk PDF - comp.text.tex | Computer Group
  • colormanagement with ghostscript ? - Rhinocerus:

    Is it for instance specified as "0 0 0 1 setcmykcolor"? Or possibly rather as "0 0 0 setrgbcolor"? In the latter case you would end up with a rich black for text, if DeviceRGB is remapped to a CIE-based color space in order to get RGB images color managed.

asked Jun 05 '11 06:06 by sdaau



2 Answers

sdaau, the command you used for trying to convert your PDF to CMYK was not correct. Try this one instead:

    gs \
       -o test-cmyk.pdf \
       -sDEVICE=pdfwrite \
       -sProcessColorModel=DeviceCMYK \
       -sColorConversionStrategy=CMYK \
       -sColorConversionStrategyForImages=CMYK \
        test.pdf

Update

If color conversion does not work as desired and if you see a message like "Unable to convert color space to Gray, reverting strategy to LeaveColorUnchanged" then...

  1. your Ghostscript probably is a newer release from the 9.x version series, and
  2. your source PDF likely uses an embedded ICC color profile

In this case add -dOverrideICC to the command line and see if it changes the result as desired.


Update 2

To avoid JPEG artifacts appearing in the images (where there were none before), add:

-dEncodeColorImages=false 

into the command line.

(This is true for almost all GS PDF->PDF processing, not just for this case, because GS by default creates a completely new file with newly constructed objects and a new file structure when asked to produce PDF output. It doesn't simply re-use the previous objects, as a more "dumb" PDF processor like pdftk does {pdftk has other advantages though, don't misunderstand my statement!}. GS applies JPEG compression by default; look at the current Ps2pdf documentation and search for "ColorImageFilter" to learn more details...)
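Putting the base command and the two updates together could look like the sketch below. The function name pdf_to_cmyk and the GS_BIN variable are assumptions made for illustration (GS_BIN just lets you point at a different gs binary); the flags are the ones already named in this answer. Whether -dOverrideICC is needed at all depends on your Ghostscript version and on the source PDF, so drop it if you never see the "reverting strategy" message.

```shell
#!/bin/bash
# Hypothetical wrapper around the gs call from this answer.
# Usage: pdf_to_cmyk input.pdf output.pdf
pdf_to_cmyk() {
  local src="$1" dst="$2"
  # GS_BIN is an illustrative override; it defaults to plain "gs".
  "${GS_BIN:-gs}" \
    -o "$dst" \
    -sDEVICE=pdfwrite \
    -sProcessColorModel=DeviceCMYK \
    -sColorConversionStrategy=CMYK \
    -sColorConversionStrategyForImages=CMYK \
    -dOverrideICC \
    -dEncodeColorImages=false \
    "$src"
}
```

For the file from the question this would be called as pdf_to_cmyk test.pdf test-cmyk.pdf.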

answered Sep 23 '22 04:09 by Kurt Pfeifle


I have an unrelated problem but I am also struggling with CMYK PDFs currently.

I wrote this little script here (it's called pdf2pdfx):

    #!/bin/bash

    gs \
    -dPDFX \
    -dBATCH \
    -dNOPAUSE \
    -dNOOUTERSAVE \
    -sDEVICE=pdfwrite \
    -sColorConversionStrategy=CMYK \
    -dProcessColorModel=/DeviceCMYK \
    -dPDFSETTINGS=/prepress \
    -sOutputFile="${1%%.pdf}_X-3.pdf" \
    PDFX_def.ps \
    "$1"

and my PDFX_def.ps contains the following (I removed the ICC profile and defined FOGRA39, this should be OK):

    %!
    % $Id$
    % This is a sample prefix file for creating a PDF/X-3 document.
    % Feel free to modify entries marked with "Customize".

    % This assumes an ICC profile to reside in the file (ISO Coated sb.icc),
    % unless the user modifies the corresponding line below.

    systemdict /ProcessColorModel known {
      systemdict /ProcessColorModel get dup /DeviceGray ne exch /DeviceCMYK ne and
    } {
      true
    } ifelse
    { (ERROR: ProcessColorModel must be /DeviceGray or DeviceCMYK.)=
      /ProcessColorModel cvx /rangecheck signalerror
    } if

    % Define entries to the document Info dictionary :

    % /ICCProfile (/usr/share/color/icc/ISOcoated_v2_300_eci.icc) def  % Customize or remove.

    [ /GTS_PDFXVersion (PDF/X-3:2002) % Must be so (the standard requires).
      /Title (Title)                  % Customize.
      /Trapped /False                 % Must be so (Ghostscript doesn't provide other).
      /DOCINFO pdfmark

    % Define an ICC profile :

    currentdict /ICCProfile known {
      [/_objdef {icc_PDFX} /type /stream /OBJ pdfmark
      [{icc_PDFX} <</N systemdict /ProcessColorModel get /DeviceGray eq {1} {4} ifelse >> /PUT pdfmark
      [{icc_PDFX} ICCProfile (r) file /PUT pdfmark
    } if

    % Define the output intent dictionary :

    [/_objdef {OutputIntent_PDFX} /type /dict /OBJ pdfmark
    [{OutputIntent_PDFX} <<
      /Type /OutputIntent              % Must be so (the standard requires).
      /S /GTS_PDFX                     % Must be so (the standard requires).
      /OutputCondition (Commercial and specialty printing) % Customize
      /Info (none)                     % Customize
      /OutputConditionIdentifier (FOGRA39)      % Customize
      /RegistryName (http://www.color.org)   % Must be so (the standard requires).
      currentdict /ICCProfile known {
        /DestOutputProfile {icc_PDFX}  % Must be so (see above).
      } if
    >> /PUT pdfmark
    [{Catalog} <</OutputIntents [ {OutputIntent_PDFX} ]>> /PUT pdfmark

Identify then correctly reports CMYK colorspace. Before:

    tbart@blackknight ~/orpheus/werbung/action $ identify -verbose action_schulungsvideo_v3_print.pdf
    Image: action_schulungsvideo_v3_print.pdf
      Format: PDF (Portable Document Format)
      Class: DirectClass
      Geometry: 612x859+0+0
      Resolution: 72x72
      Print size: 8.5x11.9306
      Units: Undefined
      Type: TrueColor
      Endianess: Undefined
      Colorspace: RGB
      Depth: 16/8-bit
      Channel depth:
        red: 8-bit
        green: 8-bit
        blue: 8-bit
      Channel statistics:
        Red:
          min: 0 (0)
          max: 65535 (1)
          mean: 53873.6 (0.822058)
          standard deviation: 19276.7 (0.294144)
          kurtosis: 1.854
          skewness: -1.82565
        Green:
          min: 0 (0)
          max: 65535 (1)
          mean: 55385.6 (0.84513)
          standard deviation: 19274.6 (0.294112)
          kurtosis: 2.09868
          skewness: -1.91651
        Blue:
          min: 0 (0)
          max: 65535 (1)
          mean: 51020 (0.778516)
          standard deviation: 20077.7 (0.306367)
          kurtosis: 0.860627
          skewness: -1.52344
      Image statistics:
        Overall:
          min: 0 (0)
          max: 65535 (1)
          mean: 53426.4 (0.815235)
          standard deviation: 19546.7 (0.298263)
          kurtosis: 1.59453
          skewness: -1.75701
      Rendering intent: Undefined
      Interlace: None
      Background color: white
      Border color: rgb(223,223,223)
      Matte color: grey74
      Transparent color: black
      Compose: Over
      Page geometry: 612x859+0+0
      Dispose: Undefined
      Iterations: 0
      Compression: Undefined
      Orientation: Undefined
      Properties:
        date:create: 2011-09-14T15:38:57+02:00
        date:modify: 2011-09-14T15:38:57+02:00
        pdf:HiResBoundingBox: 612.283x858.898+0+0
        pdf:Version: PDF-1.5
        signature: 210bfc9cf90e3b9505385f8b2267da1665b5c2de28bb5223311afba01718bbeb
      Artifacts:
        verbose: true
      Tainted: False
      Filesize: 1.577MBB
      Number pixels: 526KB
      Pixels per second: 52.57MB
      User time: 0.020u
      Elapsed time: 0:01.009
      Version: ImageMagick 6.6.5-6 2011-04-08 Q16 http://www.imagemagick.org

after:

    tbart@blackknight ~/orpheus/werbung/action $ pdf2pdfx action_schulungsvideo_v3_print.pdf
    GPL Ghostscript 9.04 (2011-08-05)
    Copyright (C) 2011 Artifex Software, Inc.  All rights reserved.
    This software comes with NO WARRANTY: see the file PUBLIC for details.
    Processing pages 1 through 1.
    Page 1

    tbart@blackknight ~/orpheus/werbung/action $ identify -verbose action_schulungsvideo_v3_print_X-3.pdf
    Image: action_schulungsvideo_v3_print_X-3.pdf
      Format: PDF (Portable Document Format)
      Class: DirectClass
      Geometry: 612x859+0+0
      Resolution: 72x72
      Print size: 8.5x11.9306
      Units: Undefined
      Type: ColorSeparation
      Base type: ColorSeparation
      Endianess: Undefined
      Colorspace: CMYK
      Depth: 16/8-bit
      Channel depth:
        cyan: 8-bit
        magenta: 8-bit
        yellow: 8-bit
        black: 8-bit
      Channel statistics:
        Cyan:
          min: 0 (0)
          max: 65535 (1)
          mean: 8331.78 (0.127135)
          standard deviation: 14902.2 (0.227392)
          kurtosis: 1.62171
          skewness: 1.7799
        Magenta:
          min: 0 (0)
          max: 62194 (0.94902)
          mean: 6739.34 (0.102836)
          standard deviation: 14517.5 (0.221523)
          kurtosis: 2.08183
          skewness: 1.93276
        Yellow:
          min: 0 (0)
          max: 65535 (1)
          mean: 13310.1 (0.203098)
          standard deviation: 17022.5 (0.259746)
          kurtosis: 0.991135
          skewness: 1.45216
        Black:
          min: 0 (0)
          max: 56540 (0.862745)
          mean: 7117.47 (0.108606)
          standard deviation: 16803.7 (0.256408)
          kurtosis: 3.02752
          skewness: 2.16554
      Image statistics:
        Overall:
          min: 0 (0)
          max: 65535 (1)
          mean: 8874.66 (0.135419)
          standard deviation: 15850.6 (0.241864)
          kurtosis: 2.17614
          skewness: 1.88139
      Total ink density: 292%
      Rendering intent: Undefined
      Interlace: None
      Background color: white
      Border color: cmyk(223,223,223,0)
      Matte color: grey74
      Transparent color: black
      Compose: Over
      Page geometry: 612x859+0+0
      Dispose: Undefined
      Iterations: 0
      Compression: Undefined
      Orientation: Undefined
      Properties:
        date:create: 2011-09-14T15:39:30+02:00
        date:modify: 2011-09-14T15:39:30+02:00
        pdf:HiResBoundingBox: 612.28x858.9+0+0
        pdf:Version: PDF-1.3
        signature: 0416db7487ea147b974ece5748bc4284e82bfc3fb7cd07a4de050421ba112076
      Artifacts:
        verbose: true
      Tainted: False
      Filesize: 2.103MBB
      Number pixels: 526KB
      Pixels per second: 5.25708PB
      User time: 0.000u
      Elapsed time: 0:01.000
      Version: ImageMagick 6.6.5-6 2011-04-08 Q16 http://www.imagemagick.org
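The only field that really matters in the before/after dumps above is Colorspace, so for scripting it may be handier to pull out just that line. A minimal sketch, assuming the `identify -verbose` layout shown here; colorspace_of is a name I made up, and it reads the identify output from stdin:

```shell
#!/bin/bash
# Hypothetical filter: print the first "Colorspace:" value found in
# `identify -verbose` output arriving on stdin.
colorspace_of() {
  awk -F': *' '
    /^ *Colorspace:/ {
      sub(/[ \t\r]+$/, "", $2)   # drop trailing whitespace
      print $2
      exit                       # first page is enough
    }
  '
}
```

Used as identify -verbose action_schulungsvideo_v3_print_X-3.pdf | colorspace_of, this should print CMYK for the converted file and RGB for the original.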

This is on 64-bit Gentoo with gs 9.04. Maybe that helps?

Source PDF stems from inkscape pdf export, colors were restricted to those covered in ECI ISO coated v2. I use this as a workaround for the lacking CMYK export of inkscape and the lacking prepress-ready PDF/X output...

answered Sep 23 '22 04:09 by tbart