Intersecting rectangles can be viewed as connected nodes in a graph, and sets of "transitively" intersecting rectangles as Connected Components. To find out which rectangles intersect, we first do a Plane Sweep. To make this reasonably fast we need an Interval Tree. Banyan provides one:
from collections import defaultdict
from itertools import chain
from banyan import SortedDict, OverlappingIntervalsUpdator
def closed_regions(rects):
# Sweep Line Algorithm to set up adjacency sets:
neighbors = defaultdict(set)
status = SortedDict(updator=OverlappingIntervalsUpdator)
events = sorted(chain.from_iterable(
((r.left, False, r), (r.right, True, r)) for r in set(rects)))
for _, is_right, rect in events:
for interval in status.overlap(rect.vertical):
neighbors[rect].update(status[interval])
if is_right:
status.get(rect.vertical, set()).discard(rect)
else:
status.setdefault(rect.vertical, set()).add(rect)
# Connected Components Algorithm for graphs:
seen = set()
def component(node, neighbors=neighbors, seen=seen, see=seen.add):
todo = set([node])
next_todo = todo.pop
while todo:
node = next_todo()
see(node)
todo |= neighbors[node] - seen
yield node
for node in neighbors:
if node not in seen:
yield component(node)
rect.vertical
BTW is the tuple (rect.top, rect.bottom)
.
Time complexity is O(n log n + k)
, where n
is the number of rectangles and k
the number of actual intersections. So it's pretty close to optimal.
edit: Because there was some confusion, I need to add that the rectangles are expected to have left <= right
and top <= bottom
. IOW, the origin of the coordinate system they are in is in the upper left corner, not in the lower left corner as is usual in geometry.