Question

I want to be able to search for some text in a PDF document and replace or remove it.

I tried pdf2ps, but grep and sed found nothing in the resulting .ps document.

Are there any tools that let me use grep, sed, or a similar function on a PDF?

  • xpdf did not allow me to run its functions under Unix.
  • pdfedit needs a GUI, which I want to avoid using.

Does Ghostscript allow PDF editing? If so, which function should I use?


Solution

From Wikipedia:

Poppler comes with a text-rendering back-end as well, which can be invoked from the command line utility pdftotext. It is useful for searching for strings in PDFs from the command line, using the utility grep, for instance.
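As a rough illustration of the search half, here is a minimal Python sketch that runs pdftotext and filters its output the way grep would. It assumes the pdftotext binary from Poppler is installed and on the PATH. The helper names pdf_to_text and grep_text are made up for this example and are not part of Poppler.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Return the text of a PDF by running Poppler's pdftotext.

    Assumption: the pdftotext binary is installed and on PATH.
    """
    # Passing "-" as the output file makes pdftotext write to stdout.
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def grep_text(text, pattern):
    """Yield (line_number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    for number, line in enumerate(text.splitlines(), start=1):
        if regex.search(line):
            yield number, line
```

Calling grep_text(pdf_to_text("report.pdf"), "invoice") would list the matching lines of a hypothetical report.pdf. Note that this only searches the extracted text; it does not change the PDF itself.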

This will not solve your whole problem, since you also want to edit, but Poppler might be a library to build such a tool on if no ready-made one exists. It appears to have the functionality needed to handle the PDF format, which is not trivial:

The PDF combines three technologies:

  • A subset of the PostScript page description programming language, for generating the layout and graphics.
  • A font-embedding/replacement system to allow fonts to travel with the documents.
  • A structured storage system to bundle these elements and any associated content into a single file, with data compression where appropriate.

Source: Wikipedia
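The compression mentioned in the last point is one likely reason why grep and sed find nothing in a raw PDF. The sketch below uses a made-up page content stream (not taken from any real file) and compresses it with zlib, which is the algorithm behind PDF's FlateDecode filter, then checks whether the text is still visible as plain bytes.

```python
import zlib

# Hypothetical content stream: PDF text operators around a literal string.
content = b"BT /F1 12 Tf 72 720 Td (Hello World) Tj ET\n"

# Roughly what a FlateDecode stream holds on disk.
stored = zlib.compress(content)

# A byte-level search, like grep on the file, looks at the compressed bytes.
found_raw = b"Hello World" in stored

# After decompression the string is searchable again.
found_decoded = b"Hello World" in zlib.decompress(stored)

print("found in compressed stream:", found_raw)
print("found after decompression:", found_decoded)
```

Even after decompression, a real PDF may store text as font-specific glyph codes rather than readable characters, so a plain search can still miss it. That is the part a library such as Poppler takes care of.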

Licensed under: CC-BY-SA with attribution