Could you not run a connected-component analysis (8-connectivity) on the output image, based on the color / index information? I think that would give you the easiest starting point. All neighboring pixels that belong to the same segment end up in one component, and you can store that information and use it later.
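To make the idea concrete, here is a minimal pure-Python sketch of 8-connected component labeling on a 2D grid of segment indices. The function name `label_components` and the list-of-lists input format are my own assumptions for illustration; in practice you would probably use a library routine on the real label image instead.

```python
from collections import deque

def label_components(labels):
    """Group pixels of equal segment index into 8-connected components.

    `labels` is a 2D list of segment indices (hypothetical input format).
    Returns (component, members): a 2D list of component ids and, for each
    component id, the list of (row, col) pixels that belong to it.
    """
    rows, cols = len(labels), len(labels[0])
    component = [[-1] * cols for _ in range(rows)]  # -1 means "not visited yet"
    members = []
    # All eight neighbour offsets (8-connectivity), excluding the pixel itself.
    offsets = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)]

    for r in range(rows):
        for c in range(cols):
            if component[r][c] != -1:
                continue
            # Start a new component and flood it with a breadth-first search.
            k = len(members)
            component[r][c] = k
            pixels = []
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and component[ny][nx] == -1
                            and labels[ny][nx] == labels[y][x]):
                        component[ny][nx] = k
                        queue.append((ny, nx))
            members.append(pixels)
    return component, members

# Toy segment map: two separate regions carry the index 1.
segments = [
    [0, 0, 1],
    [0, 1, 1],
    [0, 0, 0],
    [1, 0, 0],
]
component, members = label_components(segments)
```

`members` is the stored information mentioned above: one pixel list per connected region, which you can then measure, filter, or match against other frames.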
However, I don't think you can do background/foreground segmentation with watershed alone, as that requires more information than just the segments of each frame. You will somehow have to keep track of information from previous frames too. There are a lot of papers on this; go through them. Grimson's paper was seminal, and I've seen many more since. In my opinion, the Pixel-Based Adaptive Segmenter (PBAS) was the best, but very slow.
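To show what "keeping track of previous frames" can look like, here is a rough sketch of a running-average background model. This is not the method from Grimson's paper, nor PBAS; it is a much simpler stand-in, and the function names, `alpha`, and `threshold` are assumptions chosen for illustration. Frames are assumed to be grayscale 2D lists.

```python
def foreground_mask(background, frame, threshold=30):
    """Mark pixels that differ from the background model by more than `threshold`."""
    return [
        [abs(f - b) > threshold for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the model; `alpha` controls how fast it adapts."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(bg_row, fr_row)]
        for bg_row, fr_row in zip(background, frame)
    ]

# Toy example: a flat background, then a frame with one bright pixel.
background = [[10.0, 10.0], [10.0, 10.0]]
frame = [[10, 200], [12, 10]]

mask = foreground_mask(background, frame)                    # compare against the history
background = update_background(background, frame, alpha=0.5)  # then fold the frame in
```

The background model is the per-pixel memory that a single-frame watershed result does not give you. The more elaborate methods in the literature keep richer per-pixel statistics, but they follow the same compare-then-update loop.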