Question

Is there a way to convert Levenshtein distances to error rates?

Here the error rate is the fraction of the sequence that is not exactly the same.


Solution

You mean you want to normalize Levenshtein distance to [0, 1]? That's

d(a,b) / max(len(a), len(b))

The denominator is an upper bound on the Levenshtein distance, so this gives a figure between zero and one. Proof: assume (without loss of generality) that len(a) >= len(b). Then you can always transform a into b by substituting at most len(b) characters and deleting the remaining len(a) - len(b), for a total of at most len(b) + (len(a) - len(b)) = len(a) operations. Note that if both strings are empty the denominator is zero, so that case has to be handled separately.
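To make the normalization concrete, here is a minimal Python sketch. It is not taken from the original answer: the function names levenshtein and normalized_levenshtein are made up for illustration, the distance is the textbook dynamic-programming version with unit costs, and returning 0.0 for two empty strings is an assumption you may want to change.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insert, delete and substitute."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    # previous[j] holds the distance between the processed prefix of a and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """d(a, b) / max(len(a), len(b)), a value in [0, 1]."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # assumption: two empty strings count as identical
    return levenshtein(a, b) / longest
```

For example, normalized_levenshtein("kitten", "sitting") is 3/7, roughly 0.43, and two equal-length strings with no characters in common give 1.0.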

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow