you might need to develop your own algorythms, i do that too with Aforge
Aforge basicly allows me for simple video aquisition, while my math does the interesting stuff.
In your case..
Detect spots with people
Zoom in to them ??
then it becommes tricky how to distinguish someone who dives from someone who sinks ?..
Also there are waves who can get in front of the person your trying to follow..
usually this recognition comes down to simple observations, like someone pulling his hands up is not like a circle of a swimming head..
you got to think how a beach guard can see the difference, what are the main visual clues and how can you convert them to bitmap math.