My understanding was that if a core point from a neighbor of core point is visited, it will be joined to the currently examined cluster, right?
Yes you're right and you can safely remove the line
if P' is not visited
However, this is not efficient.
If a point P' has been visited there is no need to compute its neighborhood and join it with the neighborhood of P.
It is visited means that: it's a noise point, it is already in a cluster or it's a border point. If it's already in a cluster and if it's a core point, this means that its neighbors have been processed already. If it's a border point then its neighbors must not be joined.
But how does the recursion happen here?
In the line
for each point P' in NeighborPts
you must consider NeighborPts
as a dynamic container of points. The first time you enter the for loop NeighborPts
contains X
points. If the joining adds Y
points to NeighborPts
then the for loop will visit both X
and Y
sets. This will then repeat for the sets X
and Y
and this is how the recursion happens.
Will the data collection region going to be revisited after the modification of the data collection through this code region.addAll(v);?
Yes, every time you call region.addAll(v)
, region.size()
increases and this confirms the recursion behavior that confuses you.