There are 3 different forms of discovery:
- multicast: we shout around on the network and try to find other members.
- tcp/ip: we need to have a few well known members. If one or more of these well known members is online, other members can connect to them and form a cluster.
- aws: we log into aws, read out all the instances within a given region, apply some filtering, and what remains are the well known members. From that point on we rely on tcp/ip based clustering.
That is, in short, how auto discovery works.
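To make the aws and tcp/ip steps a bit more concrete, here is a minimal sketch of the idea, not the actual implementation. The `Instance` record, the tag-based filter, the port `5701` and the `is_reachable` callback are all assumptions made up for illustration; a real setup would get the instance list from the cloud provider's API and probe members over the network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional


@dataclass
class Instance:
    """Hypothetical record for one cloud instance (not a real SDK type)."""
    private_ip: str
    state: str
    tags: Dict[str, str] = field(default_factory=dict)


def well_known_members(instances: Iterable[Instance], tag_key: str,
                       tag_value: str, port: int = 5701) -> List[str]:
    """Filter step: keep running instances that carry the expected tag.

    The result is the list of well known members as host:port strings.
    The port is an arbitrary example value.
    """
    return [
        f"{inst.private_ip}:{port}"
        for inst in instances
        if inst.state == "running" and inst.tags.get(tag_key) == tag_value
    ]


def find_cluster(members: Iterable[str],
                 is_reachable: Callable[[str], bool]) -> Optional[str]:
    """tcp/ip step: return the first well known member that is online.

    Returns None when no well known member answers, in which case
    there is no existing cluster to join.
    """
    for address in members:
        if is_reachable(address):
            return address
    return None


# Example: four instances in the region, two survive the filtering.
instances = [
    Instance("10.0.0.1", "running", {"cluster": "prod"}),
    Instance("10.0.0.2", "stopped", {"cluster": "prod"}),
    Instance("10.0.0.3", "running", {"cluster": "test"}),
    Instance("10.0.0.4", "running", {"cluster": "prod"}),
]
members = well_known_members(instances, "cluster", "prod")

# Pretend that only 10.0.0.4 is currently online.
contact = find_cluster(members, lambda addr: addr.startswith("10.0.0.4:"))
```

The reachability check is passed in as a function so the selection logic can be tried out without a network; in practice it would be an actual connection attempt to the member.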
Node failure detection is based on heartbeats. Every x seconds we send a message to all members in the cluster; the ones that can't reply are eventually declared dead.
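The heartbeat idea can be sketched roughly like this. It is only an illustration under assumptions: the `HeartbeatMonitor` class, its method names and the timeout value are hypothetical, and time is passed in explicitly so the logic can be followed without real clocks or sockets.

```python
from typing import Dict, Iterable, List


class HeartbeatMonitor:
    """Hypothetical tracker of when each member last replied to a heartbeat."""

    def __init__(self, members: Iterable[str], now: float,
                 timeout_seconds: float = 60.0) -> None:
        # How long a member may stay silent before it is declared dead.
        self.timeout_seconds = timeout_seconds
        # Every member starts out as "seen just now".
        self.last_seen: Dict[str, float] = {m: now for m in members}

    def record_reply(self, member: str, now: float) -> None:
        """Called whenever a member answers a heartbeat message."""
        if member in self.last_seen:
            self.last_seen[member] = now

    def evict_dead(self, now: float) -> List[str]:
        """Remove and return the members that have been silent for too long."""
        dead = [
            member
            for member, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        ]
        for member in dead:
            del self.last_seen[member]
        return dead


# Example: member "b" stops replying, member "a" keeps answering.
monitor = HeartbeatMonitor(["a", "b"], now=0.0, timeout_seconds=10.0)
monitor.record_reply("a", now=8.0)
dead_early = monitor.evict_dead(now=9.0)   # nobody has exceeded the timeout yet
dead_late = monitor.evict_dead(now=12.0)   # "b" has been silent for 12 seconds
```

Using a timeout that spans several heartbeat intervals is what makes members "eventually" dead rather than dead on the first missed reply, so a single lost message does not split the cluster.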