Question

I'm trying to detect objects and text in a hand-drawn diagram.

My goal is to "parse" something like this into an object structure for further processing. My first aim is to detect text, lines and boxes (arrows etc. are not important, for now ;))

I can do dilation, erosion, Otsu thresholding, inversion etc. and easily get to something like this

What I need some guidance on are the next steps. I have several ideas:

  1. Contour Analysis
  2. OCR using UNIPEN
  3. Edge detection

Contour Analysis

I've been reading about "Contour Analysis for Image Recognition in C#" on CodeProject, which could be a great way to recognize boxes etc., but my issue is that the boxes are connected and therefore do not form separate objects to match against a template. So I need some advice on whether this is a feasible way to go.

OCR using UNIPEN

I would like to use UNIPEN (see "Large pattern recognition system using multi neural networks" on CodeProject) to recognize handwritten letters and then "remove" them from the image leaving only the boxes and lines.

Edge detection

Another way could be to detect all lines and corners and from those infer the boxes and lines that the image consists of. In that case, ideas on how to straighten the lines and find the 90-degree corners would be helpful.

Generally, I think I just need some pointers on which strategy to apply, not code samples (though it would be great ;))


Solution

I will try to answer about the contour analysis and the lines between them.

If you need to turn the interconnected boxes into separate objects, that can be achieved easily enough:

  1. close the gaps in the box edges with morphological closing
  2. perform connected components labeling and look for compact objects (e.g. objects whose area is close to the area of their bounding box)

You will get the insides of the boxes. These can be elliptical, rectangular or any other shape found in common diagrams; contour analysis can tell you which. A problem may arise for enclosed background areas (e.g. the space between the ABC links in your example diagram). You might eliminate these on the criterion that their bounding box overlaps the bounding boxes of several other objects.
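The steps above can be sketched roughly as follows. This is a minimal pure-Python illustration, not a definitive implementation: it assumes the thresholded image is a list of rows with 1 for ink and 0 for background, and the names close_ink, find_box_interiors and drop_enclosed_background, as well as the thresholds min_fill, min_area and max_overlaps, are made up for this example. In practice OpenCV's cv2.morphologyEx and cv2.connectedComponentsWithStats would do the same work much faster.

```python
from collections import deque

def _morph(img, radius, op):
    """Apply op (max = dilate, min = erode) over a square window clipped at the border."""
    h, w = len(img), len(img[0])
    return [[op(img[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1)))
             for x in range(w)] for y in range(h)]

def close_ink(img, radius=1):
    """Morphological closing of the ink: dilation followed by erosion."""
    return _morph(_morph(img, radius, max), radius, min)

def find_box_interiors(ink, min_fill=0.8, min_area=4):
    """Label 4-connected background regions and keep the compact, enclosed ones.

    Returns bounding boxes (x0, y0, x1, y1) of regions that do not touch the
    image border and fill at least min_fill of their bounding box.
    """
    h, w = len(ink), len(ink[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for sy in range(h):
        for sx in range(w):
            if ink[sy][sx] or seen[sy][sx]:
                continue
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            area, touches_border = 0, False
            x0 = x1 = sx
            y0 = y1 = sy
            while queue:
                y, x = queue.popleft()
                area += 1
                x0, x1 = min(x0, x), max(x1, x)
                y0, y1 = min(y0, y), max(y1, y)
                if y in (0, h - 1) or x in (0, w - 1):
                    touches_border = True  # outer background, not a box
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and not ink[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            bbox_area = (x1 - x0 + 1) * (y1 - y0 + 1)
            if not touches_border and area >= min_area and area / bbox_area >= min_fill:
                boxes.append((x0, y0, x1, y1))
    return boxes

def _overlaps(a, b):
    """True if two (x0, y0, x1, y1) bounding boxes intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def drop_enclosed_background(boxes, max_overlaps=1):
    """Discard regions whose bounding box overlaps more than max_overlaps others."""
    return [a for i, a in enumerate(boxes)
            if sum(_overlaps(a, b) for j, b in enumerate(boxes) if j != i) <= max_overlaps]
```

A typical call would be drop_enclosed_background(find_box_interiors(close_ink(img))). The closing radius has to be at least half the widest gap you expect in a box outline, and all thresholds would need tuning on real drawings.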

Now find line segments with HoughLinesP. If a segment finishes or starts within a certain distance of the edge of one of the objects, you can assume it is connected to that object.
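A minimal sketch of that attachment rule, under a few assumptions: segments are (x1, y1, x2, y2) tuples in the layout cv2.HoughLinesP produces, objects are represented only by their bounding boxes (x0, y0, x1, y1), and the function names and the max_dist tolerance are invented for this example.

```python
import math

def distance_to_box_edge(px, py, box):
    """Distance from a point to the boundary of an axis-aligned box."""
    x0, y0, x1, y1 = box
    dx = max(x0 - px, 0, px - x1)
    dy = max(y0 - py, 0, py - y1)
    if dx == 0 and dy == 0:
        # Point lies inside the box (or on it): distance to the nearest side.
        return min(px - x0, x1 - px, py - y0, y1 - py)
    return math.hypot(dx, dy)

def connect_segments(segments, boxes, max_dist=5.0):
    """For each segment return (start_box, end_box) as indices into boxes.

    An endpoint is attached to the closest box if it lies within max_dist
    of that box's edge; otherwise the entry is None.
    """
    links = []
    for x1, y1, x2, y2 in segments:
        ends = []
        for px, py in ((x1, y1), (x2, y2)):
            best, best_d = None, max_dist
            for i, box in enumerate(boxes):
                d = distance_to_box_edge(px, py, box)
                if d <= best_d:
                    best, best_d = i, d
            ends.append(best)
        links.append(tuple(ends))
    return links
```

For example, with boxes [(0, 0, 10, 10), (30, 0, 40, 10)] the segment (12, 5, 28, 5) is linked to both boxes, while a segment far from either gets (None, None). The Hough transform will probably also report the box outlines themselves as segments, so you may want to discard segments that run along a box edge before linking.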

As an added touch you could try to detect arrow ends on either side by checking the width profile of the line segments in a neighbourhood of their endpoints.
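One way to read "width profile" is sketched below; treat it as a starting point under stated assumptions rather than a tested detector. The ink image is again a list of rows with 1 for ink, the segment is a non-degenerate (x1, y1, x2, y2) tuple, and width_profile, arrow_ends and the parameters samples, max_half, end_fraction and ratio are hypothetical names chosen for this example. The stroke width at each sample is taken as the spread of ink pixels along the normal to the segment, so both filled and V-shaped arrowheads widen the profile.

```python
import math

def width_profile(ink, seg, samples=20, max_half=10):
    """Measure the ink spread perpendicular to the segment at evenly spaced points."""
    x1, y1, x2, y2 = seg
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("segment has zero length")
    # Unit normal to the segment direction.
    nx, ny = -(y2 - y1) / length, (x2 - x1) / length
    h, w = len(ink), len(ink[0])
    profile = []
    for i in range(samples):
        s = i / (samples - 1)
        cx, cy = x1 + s * (x2 - x1), y1 + s * (y2 - y1)
        hits = []
        for t in range(-max_half, max_half + 1):
            px, py = round(cx + t * nx), round(cy + t * ny)
            if 0 <= py < h and 0 <= px < w and ink[py][px]:
                hits.append(t)
        profile.append(max(hits) - min(hits) + 1 if hits else 0)
    return profile

def arrow_ends(profile, end_fraction=0.2, ratio=2.0):
    """Return (arrow_at_start, arrow_at_end) by comparing end widths to the median mid-width."""
    n = len(profile)
    k = max(1, int(n * end_fraction))
    middle = sorted(profile[k:n - k]) or sorted(profile)
    base = max(1, middle[len(middle) // 2])
    return (max(profile[:k]) >= ratio * base, max(profile[-k:]) >= ratio * base)
```

Because a segment that ends at a box also meets the box outline there, the neighbourhood inspected at each end should stop short of the box edge, or the outline itself will look like a widening of the stroke.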

It is an interesting problem. I will try to remember it and give it to my students to cut their teeth on.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow