Question

I have started working on an online signature dataset for verification purposes. I have two matrices containing the digitized data of two signatures of different lengths (the number of rows differs), e.g. one is 177×7 and the other is 170×7.

I want to treat each column as one time series, and I'd like to compare each time series of one signature with the corresponding time series of the second signature.

How should I align the two time series?

Solution

I think this question really belongs on Math.StackExchange, but I will do my best to answer it here. The short answer is that the Euclidean distance cannot be applied directly to vectors of different lengths, so you will need to define your own notion of distance. This may or may not actually be feasible.

The notion of distance relies on the existence of a "metric" defined on the space of interest. If your vectors have different lengths, then traditional metrics (including the Euclidean distance) are not defined between them, and you need to define a new metric that works for you.

There are two things you'll need to do here:

  1. Define the space you're working with. This seems to be the set of all vectors of length 177 together with all vectors of length 170, which is a very unusual set.
  2. Define your metric (and ensure that it actually satisfies all the properties of a metric).
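
For reference, the properties mentioned in step 2 are the standard metric axioms. Writing X for the space from step 1 (the symbol X is introduced here only for illustration), a function d is a metric on X if the following hold for all x, y, z in X:

```latex
\begin{align*}
d(x, y) &\ge 0                    && \text{(non-negativity)} \\
d(x, y) &= 0 \iff x = y           && \text{(identity of indiscernibles)} \\
d(x, y) &= d(y, x)                && \text{(symmetry)} \\
d(x, z) &\le d(x, y) + d(y, z)    && \text{(triangle inequality)}
\end{align*}
```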

The most obvious solution is to project the vectors of length 177 onto the space of vectors of length 170 and then compute the Euclidean distance as usual. For example, you could just ignore the last 7 elements of the longer vector. Note that this is not a metric on your original set, because it violates the condition d(x,y)=0 iff x=y: two different vectors of length 177 that agree on their first 170 elements would be at distance 0. It is, however, a metric on the projected vectors. There may be a clever solution on the original set, but there is not an obvious one. Again, the people on Math.StackExchange may be able to help you more.
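
A minimal sketch of the truncation idea, assuming NumPy and using random data in place of the real signatures. The function name `projected_distance` and the variables are hypothetical, chosen only for this illustration. It also shows the violation described above: two different length-177 vectors can end up at distance 0.

```python
import numpy as np


def projected_distance(x, y, n):
    """Euclidean distance after keeping only the first n samples.

    Works on 1-D vectors (returns a scalar) and on 2-D arrays whose
    rows are time steps (returns one distance per column).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape[0] < n or y.shape[0] < n:
        raise ValueError("both inputs need at least n samples")
    return np.linalg.norm(x[:n] - y[:n], axis=0)


# Placeholder data standing in for the 177x7 and 170x7 signature matrices.
rng = np.random.default_rng(0)
sig_a = rng.normal(size=(177, 7))
sig_b = rng.normal(size=(170, 7))

# One distance per column, i.e. per time series.
per_column = projected_distance(sig_a, sig_b, n=170)

# Two different length-177 vectors that agree on their first 170 elements.
a = sig_a[:, 0]
a2 = a.copy()
a2[170:] += 1.0  # change only the 7 discarded samples

# The vectors differ, yet their projected distance is zero,
# so this is not a metric on the original length-177 vectors.
zero_distance = projected_distance(a, a2, n=170)
```

Truncation simply discards the tail of the longer signature, so whether it is acceptable depends on how much information those last samples carry.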

Licensed under: CC-BY-SA with attribution