It sounds like you are trying to determine whether multiple collections of time/space coordinates (a.k.a. "tracks") are likely to correspond to the same object. This is known in some circles as "object tracking", and there is a fair amount of literature on it (e.g., "Object Tracking: A Survey", hosted by CRCV at the University of Central Florida). That survey points to further literature, which may explain various algorithms for predicting the future locations of objects.
I think what you want to do is extrapolate each track from its known points in time/space, so that you can compare the tracks' positions at the same moments in time. (Euclidean distance may be fine for the comparison.) In your distance function, you probably want to give comparisons that involve a "predicted" location less weight than comparisons between two actual (captured) locations, since a prediction is less certain than a measurement.
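To make that concrete, here is a rough sketch in Python. It is only one way to do it, and it rests on assumptions that are mine, not yours: each track is a list of `(t, x, y)` samples sorted by time with distinct timestamps, motion is roughly linear between and beyond samples, and the names `position_at`, `track_distance`, and `predicted_weight` (with its default of 0.5) are made up for illustration.

```python
import math


def position_at(track, t):
    """Estimate (x, y) at time t by linear interpolation or extrapolation.

    track: list of (t, x, y) samples sorted by time (at least two samples,
    distinct timestamps). Returns ((x, y), predicted), where predicted is
    True when t lies outside the captured time range of the track.
    """
    predicted = t < track[0][0] or t > track[-1][0]
    # Pick the segment that contains t, or the nearest end segment
    # when t is before the first or after the last sample.
    i = 0
    while i < len(track) - 2 and track[i + 1][0] < t:
        i += 1
    (t0, x0, y0), (t1, x1, y1) = track[i], track[i + 1]
    f = (t - t0) / (t1 - t0)
    return (x0 + f * (x1 - x0), y0 + f * (y1 - y0)), predicted


def track_distance(a, b, times, predicted_weight=0.5):
    """Weighted mean Euclidean distance between two tracks at the given times.

    A comparison where either position had to be extrapolated counts with
    predicted_weight (an assumed tuning knob); otherwise it counts with 1.0.
    """
    total = 0.0
    weight_sum = 0.0
    for t in times:
        pa, a_predicted = position_at(a, t)
        pb, b_predicted = position_at(b, t)
        w = predicted_weight if (a_predicted or b_predicted) else 1.0
        total += w * math.dist(pa, pb)
        weight_sum += w
    return total / weight_sum
```

You would call something like `track_distance(track_a, track_b, times=[1, 2, 3])` and treat a small result as evidence that the two tracks belong to the same object. Linear extrapolation is just the simplest choice here; the literature above covers better motion models, and you may also want the weight to shrink the further a prediction is from the last captured point.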
I hope I didn't misinterpret your intent.