Here is a command-line method to extract ICC color profiles from a PDF. It uses the Python script pdf-parser.py, written by security researcher Didier Stevens, which you can download here.
However, this is not a specialized tool for ICC extraction. (I do not know of such a tool.) It is a generic command-line tool for investigating PDF files.
Therefore you need to go through several steps in order to achieve the extraction.
Step 1: Determine the PDF object ID of the ICC profile
Use the -s option to search for the string ICCBased. (PDF files without an embedded ICC profile will not contain this keyword, with the possible exception of it appearing in their text content.)
pdf-parser.py -s ICCBased my.pdf
My test PDF returned this:
obj 18 0
Type:
Referencing: 21 0 R
It seems that an ICC profile is to be found in PDF object 21.
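To illustrate what the search in step 1 is looking for, here is a minimal Python sketch that scans raw PDF bytes for ICCBased references. It is not how pdf-parser.py works internally. The function name `find_icc_object_ids` is my own, and the sketch assumes the referencing object is stored uncompressed (it will miss references hidden inside compressed object streams).

```python
import re

# Matches an ICCBased color space entry such as "[/ICCBased 21 0 R]".
ICC_REF = re.compile(rb"/ICCBased\s+(\d+)\s+(\d+)\s+R")

def find_icc_object_ids(pdf_bytes):
    """Return the sorted, de-duplicated object numbers referenced by /ICCBased."""
    return sorted({int(m.group(1)) for m in ICC_REF.finditer(pdf_bytes)})

# Hypothetical fragment modelled on the test PDF above (object 18 points to 21).
sample = b"18 0 obj\n[/ICCBased 21 0 R]\nendobj\n"
print(find_icc_object_ids(sample))
```

For a real file you would pass the result of open("my.pdf", "rb").read() instead of the sample bytes.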
Step 2: Look at the PDF object found in step 1
Use -o 21 to see what PDF object 21 is:
pdf-parser.py -o 21 my.pdf
My test PDF returns this:
obj 21 0
Type:
Referencing:
Contains stream
<<
/Alternate /DeviceRGB
/Filter /FlateDecode
/Length 2574
/N 3
>>
Ok, this looks like we are getting close...
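The dictionary entries can be read as follows: /N is the number of color components, /Filter names the compression, and /Length is the size of the compressed stream. As a rough sketch (the helper `describe_icc_stream_dict` is a name I made up, and it assumes /Length is a direct number rather than an indirect reference), the values could be pulled out of the printed dictionary like this. The fallback table reflects the PDF rule that N = 1, 3 or 4 implies DeviceGray, DeviceRGB or DeviceCMYK when no /Alternate is given.

```python
import re

# Fallback color space implied by /N when /Alternate is absent.
DEFAULT_ALTERNATE = {1: "DeviceGray", 3: "DeviceRGB", 4: "DeviceCMYK"}

def describe_icc_stream_dict(dict_text):
    """Pull /N, /Filter and /Length out of a printed stream dictionary."""
    def find(pattern):
        match = re.search(pattern, dict_text)
        return match.group(1) if match else None

    n = find(r"/N\s+(\d+)")
    length = find(r"/Length\s+(\d+)")
    return {
        "components": int(n) if n else None,
        "default_alternate": DEFAULT_ALTERNATE.get(int(n)) if n else None,
        "filter": find(r"/Filter\s*/(\w+)"),
        "length": int(length) if length else None,
    }

# The dictionary shown in step 2, written on one line.
step2_dict = "<< /Alternate /DeviceRGB /Filter /FlateDecode /Length 2574 /N 3 >>"
print(describe_icc_stream_dict(step2_dict))
```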
Step 3: Dump the stream contained in the PDF object containing the profile
In step 2 we learned two important things:
- PDF object 21 contains a stream (whose contents are not shown by the -o 21 parameter of pdf-parser.py).
- The stream has to be decompressed with the /FlateDecode filter in order to get to its content.
Hence we now have to run pdf-parser.py with two additional arguments:
- -d filename in order to dump the stream of PDF object 21 to a file.
- -f in order to filter/uncompress the stream when dumping it to a file.
Command to run:
pdf-parser.py -o 21 -f -d 21.stream my.pdf
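For the curious: /FlateDecode is zlib/deflate compression, so the filtering done in this step can be approximated with Python's standard zlib module. The following is only a sketch under simplifying assumptions (the object is stored uncompressed at the top level of the file, uses a single FlateDecode filter without predictors, and the function name `dump_flate_stream` is my own). It is not a replacement for pdf-parser.py.

```python
import re
import zlib

def dump_flate_stream(pdf_bytes, obj_num, gen=0):
    """Return the decompressed stream of object `obj_num`, or raise ValueError."""
    pattern = re.compile(
        rb"(?<!\d)%d\s+%d\s+obj\b.*?\bstream\r?\n(.*?)endstream" % (obj_num, gen),
        re.DOTALL,
    )
    match = pattern.search(pdf_bytes)
    if match is None:
        raise ValueError("object %d %d has no stream" % (obj_num, gen))
    # decompressobj ignores the end-of-line bytes that precede "endstream".
    return zlib.decompressobj().decompress(match.group(1))

# Tiny synthetic PDF fragment standing in for my.pdf.
payload = b"pretend this is an ICC profile"
fake_pdf = (
    b"21 0 obj\n<< /Filter /FlateDecode /N 3 >>\nstream\n"
    + zlib.compress(payload)
    + b"\nendstream\nendobj\n"
)
print(dump_flate_stream(fake_pdf, 21) == payload)
```

With a real file you would read my.pdf in binary mode and write the returned bytes to 21.stream.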
Step 4: Verify what was extracted
We have now dumped the stream of PDF object 21 to a file named 21.stream. Let's see what it contains:
file 21.stream
21.stream: Microsoft ICM Color Profile
Looks like we succeeded. :-)
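If the file utility is not at hand, a similar sanity check can be sketched in a few lines of Python. An ICC profile starts with a 128-byte header that stores the profile size in its first four bytes (big-endian) and the signature 'acsp' at byte offset 36. The function name `looks_like_icc_profile` is my own, and the check is deliberately strict about the size field matching the file length.

```python
import struct

def looks_like_icc_profile(data):
    """Check two cheap invariants of an ICC profile header."""
    if len(data) < 128:  # the ICC header is always 128 bytes long
        return False
    (declared_size,) = struct.unpack(">I", data[0:4])
    return data[36:40] == b"acsp" and declared_size == len(data)

# Synthetic 128-byte header: size field, padding, 'acsp' signature, padding.
fake_header = struct.pack(">I", 128) + b"\0" * 32 + b"acsp" + b"\0" * 88
print(looks_like_icc_profile(fake_header))
```

For the extracted file you would call it on open("21.stream", "rb").read().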
Step 5: Open the color profile
Let's see whether my Mac OS X system accepts this profile:
mv 21.stream 21.icm
open 21.icm
OS X uses the ColorSync Utility to open the file and display a window. Clicking on the list entries opens different information panes at the bottom of the window.
Step 6: Use Argyll's iccdump to dump the contents of the ICC profile as text
Note that Graeme Gill's ArgyllCMS, the open source color management software available for Linux, Mac OS X and Windows, ships with a whole suite of command-line tools. One of these is iccdump. We can use it to look at the properties of the newly extracted 21.icm file:
iccdump 21.icm
icc:
Header:
  size         = 3144 bytes
  CMM          = 'Lino'
  Version      = 2.1.0
  Device Class = Display
  Color Space  = RGB
  Conn. Space  = XYZ
  Date, Time   = 9 Feb 1998, 6:49:00
  Platform     = Microsoft
  Flags        = Not Embedded Profile, Use anywhere
  Dev. Mnfctr. = 'IEC '
  Dev. Model   = 'sRGB'
  Dev. Attrbts = Reflective, Glossy
  Rndrng Intnt = Perceptual
  Illuminant   = 0.964203, 1.000000, 0.824905 [Lab 100.000000, 0.000498, -0.000436]
  Creator      = 'HP '

tag 0:
  sig      'cprt'
  type     'text'
  offset   336
  size     51
Text:
  No. chars = 43
    0x0000: Copyright (c) 1998 Hewlett-Packard Company

tag 1:
  sig      'desc'
  type     'desc'
  offset   388
  size     108
TextDescription:
  ASCII data, length 18 chars:
    0x0000: sRGB IEC61966-2.1
  No Unicode data
  ScriptCode Data, Code 0x0, length 18 chars
    0x0000: 73 52 47 42 20 49 45 43 36 31 39 36 36 2d 32 2e 31 00

tag 2:
  sig      'wtpt'
  type     'XYZ '
  offset   496
  size     20
XYZArray:
  No. elements = 1

tag 3:
  sig      'bkpt'
  type     'XYZ '
  offset   516
  size     20
XYZArray:
  No. elements = 1

tag 4:
  sig      'rXYZ'
  type     'XYZ '
  offset   536
  size     20
XYZArray:
  No. elements = 1

tag 5:
  sig      'gXYZ'
  type     'XYZ '
  offset   556
  size     20
XYZArray:
  No. elements = 1

tag 6:
  sig      'bXYZ'
  type     'XYZ '
  offset   576
  size     20
XYZArray:
  No. elements = 1

tag 7:
  sig      'dmnd'
  type     'desc'
  offset   596
  size     112
TextDescription:
  ASCII data, length 22 chars:
    0x0000: IEC http://www.iec.ch
  No Unicode data
  ScriptCode Data, Code 0x0, length 22 chars
    0x0000: 49 45 43 20 68 74 74 70 3a 2f 2f 77 77 77 2e 69 65 63 2e 63 68 00

tag 8:
  sig      'dmdd'
  type     'desc'
  offset   708
  size     136
TextDescription:
  ASCII data, length 46 chars:
    0x0000: IEC 61966-2.1 Default RGB colour space - sRGB
  No Unicode data
  ScriptCode Data, Code 0x0, length 46 chars
    0x0000: 49 45 43 20 36 31 39 36 36 2d 32 2e 31 20 44 65 66 61 75 6c 74 20 ...

tag 9:
  sig      'vued'
  type     'desc'
  offset   844
  size     134
TextDescription:
  ASCII data, length 44 chars:
    0x0000: Reference Viewing Condition in IEC61966-2.1
  No Unicode data
  ScriptCode Data, Code 0x0, length 44 chars
    0x0000: 52 65 66 65 72 65 6e 63 65 20 56 69 65 77 69 6e 67 20 43 6f 6e 64 ...

tag 10:
  sig      'view'
  type     'view'
  offset   980
  size     36
Viewing Conditions:
  XYZ value of illuminant in cd/m^2 = 19.644501, 20.371796, 16.808899
  XYZ value of surround in cd/m^2 = 3.928894, 4.074387, 3.361786
  Illuminant type = D50

tag 11:
  sig      'lumi'
  type     'XYZ '
  offset   1016
  size     20
XYZArray:
  No. elements = 1

tag 12:
  sig      'meas'
  type     'meas'
  offset   1036
  size     36
Measurement:
  Standard Observer = 1931 Two Degrees
  XYZ for Measurement Backing = 0.000000, 0.000000, 0.000000 [Lab 0.000000, 0.000000, 0.000000]
  Measurement Geometry = Unknown
  Measurement Flare = 1.0%
  Standard Illuminant = D65

tag 13:
  sig      'tech'
  type     'sig '
  offset   1072
  size     12
Signature
  Technology = Cathode Ray Tube Display

tag 14:
  sig      'rTRC'
  type     'curv'
  offset   1084
  size     2060
Curve:
  No. elements = 1024

tag 15:
  sig      'gTRC'
  type     'curv'
  offset   1084
  size     2060
Curve:
  No. elements = 1024

tag 16:
  sig      'bTRC'
  type     'curv'
  offset   1084
  size     2060
Curve:
  No. elements = 1024
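Most of what iccdump prints at the top comes straight from the fixed 128-byte profile header and the tag table that follows it. As a rough illustration, not a substitute for iccdump, here is a Python sketch that decodes a few of those fields with the standard struct module. The function name `read_icc_summary` and the returned dictionary layout are my own choices, and the sketch does no validation of the input.

```python
import struct

def read_icc_summary(data):
    """Decode a few header fields and the tag table of an ICC profile."""
    size, cmm, version = struct.unpack(">I4sI", data[0:12])
    dev_class, color_space, pcs = struct.unpack(">4s4s4s", data[12:24])
    # Version is stored as major byte, then minor and bugfix nibbles.
    major = version >> 24
    minor = (version >> 20) & 0xF
    bugfix = (version >> 16) & 0xF
    # The tag table starts right after the header: a count, then 12-byte entries.
    (tag_count,) = struct.unpack(">I", data[128:132])
    tags = []
    for i in range(tag_count):
        start = 132 + 12 * i
        sig, offset, length = struct.unpack(">4sII", data[start:start + 12])
        tags.append((sig.decode("ascii"), offset, length))
    return {
        "size": size,
        "cmm": cmm.decode("ascii"),
        "version": "%d.%d.%d" % (major, minor, bugfix),
        "class": dev_class.decode("ascii"),        # 'mntr' means Display
        "color_space": color_space.decode("ascii"),
        "pcs": pcs.decode("ascii"),                # the "Conn. Space" above
        "tags": tags,
    }
```

Calling read_icc_summary(open("21.icm", "rb").read()) on the extracted profile should give values corresponding to the Header section above, with the device class shown as its raw signature 'mntr' rather than the word Display.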
P.S.:
ArgyllCMS contains a command-line tool, extracticc, which can extract an embedded ICC profile from a TIFF file. It does not have a tool to extract a profile from a PDF file.