You can use the label function from scipy.ndimage to identify the distinct buildings.
Here's your example array, containing two buildings:
In [57]: a
Out[57]:
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 1, 1, 1, 0, 1, 1],
       [0, 1, 0, 1, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0]])
Import label.
In [58]: from scipy.ndimage import label
Apply label to a. It returns two values: the array of labeled positions, and the number of distinct objects (buildings, in this case) found.
In [59]: lbl, nlbls = label(a)
In [60]: lbl
Out[60]:
array([[0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 1, 1, 1, 0, 2, 2],
       [0, 1, 0, 1, 0, 0, 2],
       [0, 0, 0, 0, 0, 0, 0]], dtype=int32)
In [61]: nlbls
Out[61]: 2
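Note that, by default, label only treats cells that share an edge as connected; cells that touch only at a corner end up in different objects. If diagonal contact should count as the same building, a structuring element can be passed via the structure argument. A minimal sketch (the array diag is made up for illustration and is not part of the question's data):

```python
import numpy as np
from scipy.ndimage import label

# Hypothetical 2x2 example: two cells that touch only at a corner.
diag = np.array([[1, 0],
                 [0, 1]])

# Default structuring element: only edge-sharing neighbors are connected.
_, n_default = label(diag)

# 3x3 block of ones: diagonal neighbors are connected as well.
_, n_diagonal = label(diag, structure=np.ones((3, 3), dtype=int))

print(n_default, n_diagonal)
```

With the default the two cells are counted as two objects; with the 3x3 structure they are counted as one.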
To get the coordinates of a building, np.where can be used. For example,
In [64]: np.where(lbl == 2)
Out[64]: (array([2, 2, 3]), array([5, 6, 6]))
It returns a tuple of arrays; the kth array holds the indices along the kth dimension (rows first, then columns, for a 2-D array). You can use, for example, np.column_stack to combine these into an array:
In [65]: np.column_stack(np.where(lbl == 2))
Out[65]:
array([[2, 5],
       [2, 6],
       [3, 6]])
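As an alternative, np.argwhere should give the same (row, column) array in a single call. The sketch below rebuilds the example array so that it runs on its own; the name coords_2 is just for illustration:

```python
import numpy as np
from scipy.ndimage import label

# The example array from above, containing two buildings.
a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0, 0],
              [0, 1, 1, 1, 0, 1, 1],
              [0, 1, 0, 1, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 0]])

lbl, nlbls = label(a)

# One row per cell of building 2, as (row, column) pairs.
coords_2 = np.argwhere(lbl == 2)
print(coords_2)
```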
You might want a list of all the coordinate arrays. Here's one way to create such a list.
For convenience, first create a list of labels:
In [66]: labels = list(range(1, nlbls+1))
In [67]: labels
Out[67]: [1, 2]
Use a list comprehension to create the list of coordinate arrays.
In [68]: coords = [np.column_stack(np.where(lbl == k)) for k in labels]
In [69]: coords
Out[69]:
[array([[1, 2],
        [2, 1],
        [2, 2],
        [2, 3],
        [3, 1],
        [3, 3]]),
 array([[2, 5],
        [2, 6],
        [3, 6]])]
Now your building data is in labels and coords. For example, the first building was labeled labels[0], and its coordinates are in coords[0]:
In [70]: labels[0]
Out[70]: 1
In [71]: coords[0]
Out[71]:
array([[1, 2],
       [2, 1],
       [2, 2],
       [2, 3],
       [3, 1],
       [3, 3]])
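If you would rather have this as a script than as an interactive session, the steps above could be wrapped up roughly as follows. The function name building_coords and the choice of a dict keyed by label are my own, not something the question asked for:

```python
import numpy as np
from scipy.ndimage import label


def building_coords(grid):
    """Return a dict mapping each label to an (n, 2) array of (row, col) pairs.

    building_coords is a hypothetical helper name used for this sketch.
    """
    lbl, nlbls = label(grid)
    return {k: np.column_stack(np.where(lbl == k))
            for k in range(1, nlbls + 1)}


a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 0, 0, 0, 0],
              [0, 1, 1, 1, 0, 1, 1],
              [0, 1, 0, 1, 0, 0, 1],
              [0, 0, 0, 0, 0, 0, 0]])

buildings = building_coords(a)
for k, rc in buildings.items():
    print(k, rc.tolist())
```

A dict keeps each label next to its coordinates, so there is no need to keep two parallel lists in sync.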