Question

I have an image with several non-white regions (for example paragraphs, though this is not about OCR). The spacing between these regions is somewhat regular, and a person looking at the image can see the white gaps between them.

What I plan to do is find the top and bottom corners of all regions, scan from one region's bottom corners to the next region's top corners, compute the entropy of each horizontal line, and return the Y position of the line with the lowest entropy.

[region] <--- maximum corner coordinates identified
[line with lowest entropy] <--- return Y position starting from above region's bottom corner's Y coordinate.
[region]<--- stop at Y coordinate of this region's top corner.

What I intend to do is crop out these regions.

Another approach I thought of was using a histogram to identify the lowest points and somehow find the position of that lowest bar.


Solution

I'm not sure exactly what you are looking for, so if I'm wrong please add more details and I will try to update my answer. As I understand it, you are looking for white rows, which are the best places to split the page because you don't cut through anything important.

The easiest solution to implement is to calculate the sum of each row and of the next row, and check whether the difference between those values is 0 (or some other small value). Here is some simple code:

#include <cstdio>
#include <opencv2/opencv.hpp>
using namespace cv;

Mat m = imread(pathToFile);
cvtColor(m, m, COLOR_BGR2GRAY); // imread returns 3-channel BGR by default
for (int i = 0; i < m.rows - 1; i++)
{
    // Sum rows i and i + 1 over the full width (m.cols).
    Scalar s = sum(Mat(m, Rect(0, i, m.cols, 1)));
    Scalar s2 = sum(Mat(m, Rect(0, i + 1, m.cols, 1)));
    Scalar s3 = s - s2;
    if ((int)s3[0] == 0)
        printf("Empty line: %d\n", i);
}

You should also check that the line is actually white, since you may simply have found two very similar non-white lines. Add a test such as if ((int)s[0] >= someValue) {//it's ok} else {//it's bad}, where someValue is close to 255 * m.cols, the sum of a fully white row. This is not a very efficient solution, because the sum of almost every row is computed twice. A faster solution is to keep the previous row's sum in a variable, or to store all sums in a vector or array if you want to use them later.
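The idea of storing every row sum once and reusing it could look roughly like the sketch below. To keep it standalone it uses a plain vector of 8-bit rows instead of cv::Mat; the names Image, rowSums, emptyRows, maxDiff and minWhiteSum are made up for this illustration and are not part of OpenCV.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Grayscale image as rows of 8-bit pixels (a stand-in for a single-channel cv::Mat).
using Image = std::vector<std::vector<std::uint8_t>>;

// Sum every row exactly once and keep the results for later use.
std::vector<long> rowSums(const Image& img)
{
    std::vector<long> sums;
    sums.reserve(img.size());
    for (const auto& row : img) {
        long total = 0;
        for (std::uint8_t px : row) total += px;
        sums.push_back(total);
    }
    return sums;
}

// Indices i where rows i and i + 1 have sums within maxDiff of each other
// and row i is bright enough (sum >= minWhiteSum) to count as white.
std::vector<int> emptyRows(const std::vector<long>& sums, long maxDiff, long minWhiteSum)
{
    std::vector<int> rows;
    for (std::size_t i = 0; i + 1 < sums.size(); ++i) {
        if (std::labs(sums[i] - sums[i + 1]) <= maxDiff && sums[i] >= minWhiteSum)
            rows.push_back(static_cast<int>(i));
    }
    return rows;
}
```

For a 4-pixel-wide image you might call emptyRows(rowSums(img), 0, 4 * 250). Both thresholds are guesses that would need tuning for real scans with noise.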

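The most efficient way to calculate this is probably to use an integral image: compute the integral of the whole image once, then subtract the last element of row i from the last element of row i+1 to get the sum of image row i. Integral images are implemented in OpenCV as cv::integral, whose output has one extra row and column of zeros at the top and left.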
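To show what that subtraction does, here is a small hand-written integral image on a plain vector, so it runs without OpenCV. It follows the same layout as cv::integral (one extra row and column of zeros); integralImage and rowSumFromIntegral are hypothetical helper names, and in real code you would call cv::integral instead of the first function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Image = std::vector<std::vector<std::uint8_t>>;
using Integral = std::vector<std::vector<long>>;

// Build a (rows + 1) x (cols + 1) integral image: I[y][x] holds the sum of
// all pixels above and to the left of (x, y), exclusive.
Integral integralImage(const Image& img)
{
    const std::size_t rows = img.size();
    const std::size_t cols = rows ? img[0].size() : 0;
    Integral I(rows + 1, std::vector<long>(cols + 1, 0));
    for (std::size_t y = 0; y < rows; ++y)
        for (std::size_t x = 0; x < cols; ++x)
            I[y + 1][x + 1] = img[y][x] + I[y][x + 1] + I[y + 1][x] - I[y][x];
    return I;
}

// Sum of image row i: last element of integral row i + 1 minus
// last element of integral row i.
long rowSumFromIntegral(const Integral& I, std::size_t i)
{
    return I[i + 1].back() - I[i].back();
}
```

Once the integral is built, each row sum costs a single subtraction, and the same table can also give sums over any rectangle if you later want to test only part of a row.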

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow