Question

As per my understanding,

1. .eps format images are vector images.
2. When we draw something in Word (like a flowchart), it is stored as a vector image.

I am almost sure about the first, not sure about the second. Please correct me if I am wrong.

Assuming these two things, when a LaTeX file (with .eps images inserted) or a Word file (containing vector images) is converted to PDF, do the images get converted into raster images?

Also, I think PDFBox/xpdf can only extract raster images from the PDF (as they are embedded as XObjects), not vector images. Is that understanding correct? This question on Stack Overflow is related, but has not been answered yet.

No correct solution

OTHER TIPS

Your point 1 is incorrect: EPS files are PostScript programs. They may contain vector information, text, image data, or all of the above.
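As a rough illustration of that mix, here is a minimal Python sketch that counts which kinds of painting operators appear in a PostScript source. It assumes plain-text PostScript and a naive whitespace tokenizer. The function name `classify_postscript`, the operator sets (a small, non-exhaustive sample of standard operators) and the sample program are all made up for this example. Real EPS files often wrap these operators in their own procedures, so treat it as a heuristic only.

```python
import re

# Small, illustrative samples of standard PostScript operators (not exhaustive).
VECTOR_OPS = {"stroke", "fill", "eofill"}
TEXT_OPS = {"show", "ashow", "widthshow", "awidthshow"}
IMAGE_OPS = {"image", "imagemask", "colorimage"}


def classify_postscript(source: str) -> dict:
    """Count vector, text and raster painting operators in PostScript text."""
    counts = {"vector": 0, "text": 0, "image": 0}
    for line in source.splitlines():
        # Drop comments. Naive: a '%' inside a string is cut off as well.
        code = line.split("%", 1)[0]
        # Drop simple literal strings so words inside (...) are not counted.
        code = re.sub(r"\([^()]*\)", " ", code)
        for token in code.split():
            if token in VECTOR_OPS:
                counts["vector"] += 1
            elif token in TEXT_OPS:
                counts["text"] += 1
            elif token in IMAGE_OPS:
                counts["image"] += 1
    return counts


# Hypothetical EPS fragment containing all three kinds of content.
SAMPLE_EPS = """%!PS-Adobe-3.0 EPSF-3.0
%%BoundingBox: 0 0 100 100
newpath 10 10 moveto 90 90 lineto stroke
/Helvetica findfont 12 scalefont setfont
10 50 moveto (fill me) show
8 8 8 [8 0 0 8 0 0] {<00>} image
"""

print(classify_postscript(SAMPLE_EPS))
```

For the sample above, each of the three counts comes out as 1: one stroked path, one text string, and one bitmap, all in the same "image" file.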

On point 2: in PDF there is no such thing as a 'vector image'. An image means a bitmap, and therefore cannot be vector.

If you convert a PostScript program to a PDF file, then the result depends entirely on the conversion program you use. In general, vectors will be retained as vectors and text as text. However, it is entirely possible that an application might render the entire PostScript program and insert the result as an image in the PDF.

So the answer to your first question ("do the images get converted into raster images") is 'maybe, but probably not'.

I'm afraid I have no idea about the capabilities of PDFBox/xpdf. However, since collections of vectors are not necessarily arranged as 'images' in any atomic fashion (they could be held as Form XObjects or Patterns, for example), there isn't any obvious way to know when to stop extracting. And what format would you store the result in anyway?
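To make the "not atomic" point concrete, here is a minimal Python sketch that summarizes an already-decoded PDF content stream. It is not how PDFBox or xpdf work. The function name `summarize_content_stream`, the sample stream and the `xobject_subtypes` mapping (a stand-in for the page's resource dictionary) are assumptions made for illustration. The tokenizer is naive: it ignores inline images, nested parentheses, and the contents of Form XObjects.

```python
import re

# Standard PDF path-painting and text-showing operators.
PATH_PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*"}
TEXT_SHOW_OPS = {"Tj", "TJ", "'", '"'}


def summarize_content_stream(stream: str, xobject_subtypes: dict) -> dict:
    """Count painted paths, shown text and invoked XObjects in a stream.

    xobject_subtypes maps an XObject name (without the slash) to "Image"
    or "Form"; in a real PDF this comes from the resource dictionary.
    """
    summary = {"paths": 0, "text": 0, "images": 0, "forms": 0}
    # Remove literal strings so operator-like words inside them are skipped.
    cleaned = re.sub(r"\((?:\\.|[^()\\])*\)", " ", stream)
    tokens = cleaned.split()
    for i, tok in enumerate(tokens):
        if tok in PATH_PAINT_OPS:
            summary["paths"] += 1
        elif tok in TEXT_SHOW_OPS:
            summary["text"] += 1
        elif tok == "Do" and i > 0:
            # The operand before Do is the XObject name, e.g. /Im1.
            subtype = xobject_subtypes.get(tokens[i - 1].lstrip("/"))
            if subtype == "Image":
                summary["images"] += 1
            elif subtype == "Form":
                summary["forms"] += 1
    return summary


# Hypothetical page content: a stroked path, some text, a bitmap and a form.
SAMPLE_STREAM = """q 1 0 0 1 50 50 cm
0 0 m 100 0 l 100 100 l h S
BT /F1 12 Tf 10 10 Td (Hello S f) Tj ET
q 200 0 0 150 0 0 cm /Im1 Do Q
/Fm1 Do
Q
"""

print(summarize_content_stream(SAMPLE_STREAM, {"Im1": "Image", "Fm1": "Form"}))
```

The bitmap is a single named object (`/Im1`) that can be pulled out as a unit. The stroked path is just a run of operators mixed in with everything else on the page, which is why there is no obvious boundary for "one vector image".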

Licensed under: CC-BY-SA with attribution