Question

I have a lot of scans of text pages (black text on a white background).

My usual approach is to clean them in GIMP with the Curves dialog, using a simple curve with only four points: (0,0), (63,0), (224,255), (255,255).

This turns all the greyish text pitch black, sharpens the text, and turns most of the whitish pixels pure white.

How can I achieve the same effect in a script using ImageMagick or some other Linux tool that runs completely from the command line?

-normalize and -contrast-stretch don't work because they operate on pixel counts. I need an operator that makes the gray values 0-63 pitch black and everything above 224 pure white, and normalizes the values in between.


Solution

ImageMagick's Color Modifications page describes many of its color manipulation operators.

In this specific case, two algorithms are interesting:

-level gives you perfect black/white pixels near the ends of the curve and a linear distribution between.

-sigmoidal-contrast creates a smoother curve between the extremes, which works better for color photos.
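As a rough sketch of the -level approach for the curve in the question: -level takes a black point and a white point, and plain numbers are read relative to the build's quantum range rather than 0-255, so the example converts 63 and 224 to percentages first. The function names `level_args` and `clean_scan` and the file names are made up for illustration, and on ImageMagick 7 the command is `magick` rather than `convert`.

```shell
#!/bin/sh
# Convert 0-255 black/white points into the percentage form accepted by -level.
# LC_ALL=C keeps the decimal separator a dot regardless of the user's locale.
level_args() {
    LC_ALL=C awk -v b="$1" -v w="$2" \
        'BEGIN { printf "%.2f%%,%.2f%%\n", b * 100 / 255, w * 100 / 255 }'
}

# Apply the levels to one file: values at or below the black point become
# black, values at or above the white point become white, linear in between.
clean_scan() {
    convert "$1" -level "$(level_args 63 224)" "$2"
}

# Usage (hypothetical file names):
#   clean_scan scan.png scan-clean.png

# Show the -level argument that corresponds to the GIMP curve 63..224.
level_args 63 224   # prints 24.71%,87.84%
```

Using percentages keeps the command independent of whether ImageMagick was built with 8-bit or 16-bit quantum depth.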

To get a result similar to the GIMP curve, you can try applying one after the other (to make text and black areas really black).

In all cases, run -normalize first (or even -contrast-stretch, to merge most of the noise) so that no black/white levels are wasted. Without this, the darkest color could be lighter than rgb(0,0,0) and the brightest could be darker than pure white.
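Putting the steps together for a batch of scans might look like the sketch below. It only prints one command per input file so the pipeline can be checked before anything is overwritten. The function name `clean_cmd`, the `-clean.png` output suffix, and the sigmoidal settings `10x50%` are illustrative assumptions, and because -normalize stretches the histogram first, the 24.7%/87.8% thresholds may need adjusting compared with the GIMP curve.

```shell
#!/bin/sh
# Print the ImageMagick command for one scan instead of running it.
# $1 = input file, $2 = output file (names without spaces are assumed).
clean_cmd() {
    # Order follows the answer: normalize first, then clip with -level,
    # then optionally smooth the transition with -sigmoidal-contrast.
    printf 'convert %s -normalize -level 24.7%%,87.8%% -sigmoidal-contrast 10x50%% %s\n' \
        "$1" "$2"
}

# Emit one command per scan passed on the command line;
# page1.png becomes page1-clean.png next to the original.
for f in "$@"; do
    clean_cmd "$f" "${f%.*}-clean.png"
done
```

Saved as, say, `clean-scans.sh` (a hypothetical name), running `sh clean-scans.sh *.png` lists the commands, and `sh clean-scans.sh *.png | sh` executes them. Dropping the -sigmoidal-contrast step leaves the plain linear mapping.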

Other tips

[magick-users] Curves in ImageMagick

The first link in that archived message is a shell script that I think does what you're looking for.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow