Question

Is there a way to compare two geometric shapes (or any two more generic data structures), without using the brute force when a tolerance is involved?

The brute force (that is comparing each value of each object against each value of the other object) works but it's slow, and I can't use it.

I tried sorting the data and comparing two sorted collections. It's fast, but it only works with zero tolerance. As soon as I add the tolerance I get lost. The problem is that two values can be identical when I compare and different when I sort.

Here are some details of my problem.

In my Excel VBA add-in I have a collection of Shape objects made by a collection of Line objects made by two Point objects each. The add-in scans a CAD drawing via COM and creates the collection of Shape objects.

An simplified version could generate this:

            Shape 1     Shape 2
Point 1    0.0  5.0    0.0  4.9
Point 2    4.9  0.0    5.1  0.0
Point 3    5.0  5.0    5.0  5.0

I need to find which shapes are identical to which shapes, where identical means has the same shape, size and orientation, but not the same position (so far it's trivial) plus or minus a tolerance (not so trivial now!)

The Point.IsIdenticalTo(OtherPoint) is defined as:

Function IsIdenticalTo(OtherPoint As Point) As Boolean
  IsIdenticalTo = Abs(X - OtherPoint.X) < Tolerance And Abs(Y - OtherPoint.Y) < Tolerance
End Function

The brute force implementation of the Shape.IsIdenticalTo(OtherShape) works but it's too slow: if each Line(I) has an identical OtherShape.Line(J) and viceversa, then the two shapes are identical. Sometimes I have hundreds of shapes with hundreds of lines each, so the brute force solution doesn't work for me.

I tried two approaches involving sorted collections. Both are fast because comparing two sorted collections is faster than the brute force way, but both fail in some conditions:

  1. A SortedValues collection contains all the X and Y values of all the lines. The values are sorted, so the info about whether a value is an X or a Y is lost. I have used this approach for months without problems, but it fails for example when the only difference between two shapes is between the points (10, 20) and (20, 10). I added the line angle to the list of values, things have improved, but there are still cases where this approach fails, because some info is lost when the values are sorted together. In the example above this approach would work with the following collections:

    Shape 1     Shape 2
      0.0         0.0
      0.0         0.0
      4.9         4.9
      5.0         5.0
      5.0         5.0
      5.0         5.1
    
  2. A SortedLines collection contains all the lines sorted counter-clockwise and starting from the point closest to the origin. This approach doesn't lose any info, but it fails in the example above because the sorting algorithm doesn't agree with the equality comparison. If the tolerance is 0.5 they should be identical, but the sorting algorithm produces the collections shown below. Things get more difficult because my shapes contain sub-shapes, so there are many starting points on each shape.

            Shape 1     Shape 2
    Point 1    4.9  0.0    0.0  4.9
    Point 2    5.0  5.0    5.1  0.0
    Point 3    0.0  5.0    5.0  5.0
    

EDIT:

Shapes are imported from an external graphical application via COM. A shape can be as simple as rectangle or as complex as any fancy outline with 10 circles inside, 20 internal shapes and 30 lines. They represent panels with holes and simple decorations, and sometimes they have a saw-tooth shape, which makes dozen of edges.

Was it helpful?

Solution

  1. handle shape as polygon

    convert your points (each line) to set of lines (length,angle) like on this image:

    polygon representation

    this ensures invariance on rotation/translation. If you see more lines with angle=PI join them together to avoid miss comparisons of the same shapes with different sampling also try to match the same CW/CCW polygon winding rule for both shapes.

  2. find start point

    Can be biggest or smallest angle, length ... or specific order of angles+lengths. So reorder lines of one polygon (cyclic shift) so your shapes are compared from the 'same point' if they can.

  3. comparison - for exact match

    • number of lines have to be the same
    • perimeters must be the same +/- some accuracy

    so for example:

    fabs (sum of all lengths of poly1 - sum of all lengths of poly2) <= 1e-3
    

    if not shapes are different. Then compare all lengths and angles. If any one value differs more then accuracy value then shapes are different.

  4. comparison - size does not matter

    compute perimeter of both polygons l1,l2 and resize all lengths of compared poly2 to match perimeter of poly1 so all lengths of poly2 are multiplied by value = l1/l2;. After this use comparison from bullet #3

  5. comparison - shape deviations can still do positive match (size must be the same)

    try to set the number of lines to the same value (join all lines with angle close to PI). Then perimeters should "match" ... fabs(l1-l2)<=1e-3*l1. You can use bullet #4 comparison

  6. comparison - size and shape deviations can still match

    just resize poly2 to match perimeter of poly1 as in bullet #4 and then use bullet #5

If you can not find the start point in booth polygons (bullet #2)

Then you have to check for all start point shifts so if your polygons have booth 5 lines:

    poly1: l11,l12,l13,l14,l15
    poly2: l21,l22,l23,l24,l25

Then you have to compare all 5 combinations (unless you found match sooner):

    cmp (l11,l12,l13,l14,l15),(l21,l22,l23,l24,l25)
    cmp (l11,l12,l13,l14,l15),(l22,l23,l24,l25,l21)
    cmp (l11,l12,l13,l14,l15),(l22,l23,l24,l25,l21)
    cmp (l11,l12,l13,l14,l15),(l23,l24,l25,l21,l22)
    cmp (l11,l12,l13,l14,l15),(l24,l25,l21,l22,l23)
    cmp (l11,l12,l13,l14,l15),(l25,l21,l22,l23,l24)

[Notes]

  1. There are also faster ways to compare but they can miss in some cases

    • you can compare histograms of lines, angles
    • you can use neural network (I do not like them but they are ideal for classifications like this)
  2. if your shapes have to be oriented in the same ways (no rotation invariance)

    then instead of vertex angle use the line direction angle

  3. if you can not ensure the same winding rule for both compared polygons

    then you have to check them booth:

    cmp (l11,l12,l13,l14,l15),(l21,l22,l23,l24,l25)
    cmp (l11,l12,l13,l14,l15),(l25,l24,l23,l22,l21)
    

I know it is a bit vague answer but still hope it helps at least a little ...

OTHER TIPS

I am not sure how do you want to solve this problem. you want to go deep or you just want a solution. I can suggest you use an OpenCV function called "matchShapes". This function is based on Hu moments and has a good performance for rigid shapes. After you extract the target and the main contours then use the below code to compare them.

dif = cv.matchShapes(Contour1, Contour2, 1, 0, 0)

Smaller "dif" value means more similarity between contours.

I have the same problem. I compute the adjacent matrix of the vertex weighted with the distances. This compute all the sides length and diagonals. Then if the module of each row or column of the matrix are the same with the other matrix, then the two shapes are the same. For the tolerance just use the function round() before start. The complexity is O(n2 / 2), because you have to compute just an half of the adjacent matrix that is symmetric. The problem is that I cannot detect if a shape is flipped.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top