Question

I have read the difflib docs and I'm confused about how difflib.SequenceMatcher.ratio() actually works. Consider this:

import difflib
s = difflib.SequenceMatcher(None, "hey here", "hey there").ratio()
print(s)

This gives s = 0.9411764705882353.
I want to know exactly how this value is computed. As I understand it, the two strings are compared by looking at how one deviates from the other. For two strings a and b, the docs say:

differences are computed as "what do we need to do to 'a' to change it into 'b'?"

And there is also this:

for x in b, b2j[x] is a list of the indices (into b) at which x appears; junk elements do not appear

Please explain with respect to the example above.

Solution

From the docs:

Where T is the total number of elements in both sequences, and M is the number of matches, this is 2.0*M / T.

In this case, T is 17 because the first string has 8 characters and the second has 9. M is 8 because eight characters of the first string fall in blocks that also appear, in the same order, in the second: "hey " (4 characters, including the space) and "here" (4 characters). 2.0 * 8 / 17 equals 0.9411764705882353.
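To check the arithmetic above, here is a minimal sketch that recomputes the ratio by hand from get_matching_blocks(). The variable names (sm, blocks, matched, manual_ratio) are only illustrative. The last line looks at the b2j attribute quoted in the question; it is an internal index described in the class docstring, so treat relying on it as an assumption rather than a stable API.

```python
import difflib

a, b = "hey here", "hey there"
sm = difflib.SequenceMatcher(None, a, b)

# Each block is a (a, b, size) triple meaning
# a[blk.a:blk.a + size] == b[blk.b:blk.b + size].
# The final block is always a zero-length sentinel.
blocks = sm.get_matching_blocks()
matched = [a[blk.a:blk.a + blk.size] for blk in blocks if blk.size]

M = sum(blk.size for blk in blocks)   # number of matched characters
T = len(a) + len(b)                   # total characters in both strings
manual_ratio = 2.0 * M / T

print(matched)                        # the matching substrings
print(M, T)                           # 8 17
print(manual_ratio == sm.ratio())     # True

# b2j maps each character of b to the positions where it occurs in b.
print(sm.b2j["h"])                    # [0, 5]
```

The "t" in "there" is the only character left unmatched, which is why the ratio falls just short of 1.0.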

License: CC-BY-SA with attribution
Not affiliated with StackOverflow