DPI of image extracted from PDF with pdfBox
Question
I'm using the Java PDFBox library to validate single-page PDF files with embedded images.
I know that a PDF file itself doesn't contain DPI information.
However, images that have equal dimensions in the document have different sizes in pixels after extraction, and they carry no DPI metadata.
So is it possible to calculate the image sizes relative to the PDF page, or to extract the images together with their DPI information (for PNG or JPEG image files), using PDFBox?
Thanks!
Solution
Get the PrintImageLocations.java file from the PDFBox source download. Here's an excerpt of the source; only the last line is mine, and it prints the DPI:
float imageXScale = ctmNew.getXScale();
float imageYScale = ctmNew.getYScale();
System.out.println("position = " + ctmNew.getXPosition() + ", " + ctmNew.getYPosition());
// size in pixel
System.out.println("size = " + imageWidth + "px, " + imageHeight + "px");
// size in page units
System.out.println("size = " + imageXScale + "pu, " + imageYScale + "pu");
// size in inches
imageXScale /= 72;
imageYScale /= 72;
System.out.println("size = " + imageXScale + "in, " + imageYScale + "in");
// size in millimeter
imageXScale *= 25.4;
imageYScale *= 25.4;
System.out.println("size = " + imageXScale + "mm, " + imageYScale + "mm");
System.out.printf("dpi = %.0f dpi (X), %.0f dpi (Y) %n", image.getWidth() * 72 / ctmNew.getXScale(), image.getHeight() * 72 / ctmNew.getYScale());
And here's a sample output:
Found image [X0]
position = 0.0, 0.0
size = 2544px, 3523px <---- pixels
size = 610.56pu, 845.52pu <---- "page units", 1pu = 1/72 inch
size = 8.48in, 11.743334in
size = 215.39198mm, 298.28067mm
dpi = 300 dpi (X), 300 dpi (Y)
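The arithmetic behind that last line can be checked without PDFBox at all. The following is a minimal sketch, assuming the page uses the default unit of 1/72 inch as in the output above; the class and method names (ImageDpi, dpi) are hypothetical and not part of PDFBox.

```java
// Hypothetical helper, not part of PDFBox: plain arithmetic only.
final class ImageDpi {
    // Assumption: one page unit is 1/72 inch (the default in PDF).
    static final double PAGE_UNITS_PER_INCH = 72.0;

    // Effective resolution = pixels / (size on page in inches).
    static double dpi(int pixels, double sizeInPageUnits) {
        return pixels * PAGE_UNITS_PER_INCH / sizeInPageUnits;
    }

    public static void main(String[] args) {
        // Values taken from the sample output above.
        double xDpi = dpi(2544, 610.56);
        double yDpi = dpi(3523, 845.52);
        System.out.printf("dpi = %.0f dpi (X), %.0f dpi (Y)%n", xDpi, yDpi);
    }
}
```

With the sample values this works out to 300 dpi on both axes, matching the PDFBox output. Two images with the same on-page size but different pixel counts will simply give different results here, which is why the extracted files differ in size.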
Other suggestions
I am not familiar with PDFBox, but every raster image in a PDF has a CTM (current transformation matrix) associated with it. The CTM gives you the position and dimensions of the image on the page. That, together with the pixel dimensions of the extracted image, should be sufficient to calculate the effective DPI.
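As a rough sketch of that idea, independent of any library: a PDF image occupies the unit square in its own space, so a CTM written as [a b c d e f] maps its width to the vector (a, b) and its height to the vector (c, d). The lengths of those vectors are the on-page dimensions, which also covers rotated images. The class and method names below (CtmDpi, dpiFromCtm) are hypothetical, and the code assumes page units of 1/72 inch.

```java
// Hypothetical helper: derives the effective DPI from raw CTM entries.
final class CtmDpi {
    // a, b, c, d are the first four entries of the CTM [a b c d e f];
    // e and f only translate the image and do not affect its size.
    static double[] dpiFromCtm(int widthPx, int heightPx,
                               double a, double b, double c, double d) {
        double widthOnPage = Math.hypot(a, b);   // length of the mapped width vector
        double heightOnPage = Math.hypot(c, d);  // length of the mapped height vector
        // Assumption: one page unit is 1/72 inch.
        return new double[] {
            widthPx * 72.0 / widthOnPage,
            heightPx * 72.0 / heightOnPage
        };
    }

    public static void main(String[] args) {
        // Same image as in the sample output, but rotated by 90 degrees.
        double[] dpi = dpiFromCtm(2544, 3523, 0, 610.56, -845.52, 0);
        System.out.printf("dpi = %.0f dpi (X), %.0f dpi (Y)%n", dpi[0], dpi[1]);
    }
}
```

For an unrotated image b and c are zero, so this reduces to dividing the pixel counts by a and d, which is the same calculation as in the accepted answer.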