Question

I'm looking for a way to diff two strings and return the indexes where the changes start and finish.

I'm already using diff-lcs to find out which lines have changed, but I also need to know which characters within those lines have changed. I need the positions of the new characters, not the actual text (which is what most diff tools seem to give), so I can handle them with JavaScript.

So, for example if I have this string:

The brown fox jumps over the lazy dog

and compare to this string:

The red fox jumps over the crazy dog

I would like to see something like:

[[5,8],[28,33]]

Those numbers being the position where the new characters are found.

Does anyone have any idea how I might get this done?


Solution

How about the Google diff-match-patch code? https://github.com/elliotlaster/Ruby-Diff-Match-Patch

I've used it in the past and been happy with the results.

Taken from the documentation linked above:

# Diff-ing
dmp.diff_main("Apples are a fruit.", "Bananas are also fruit.", false)
=> [[-1, "Apple"], [1, "Banana"], [0, "s are a"], [1, "lso"], [0, " fruit."]]

You would just need to iterate through the non-matches and find the character position in the appropriate string.

pos_ary = s.enum_for(:scan, /search_string/).map { Regexp.last_match.begin(0) }
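Searching for each changed fragment can hit the same text in more than one place, so it may be simpler to keep a running offset while walking the diff. Here is a minimal sketch, assuming the diff arrives in the array-of-pairs format quoted above (-1 for delete, 0 for equal, 1 for insert). If your version of the library reports operations differently, the comparisons would need adjusting. inserted_ranges is a hypothetical helper name, not part of the library, and the offsets it returns are 0-based with an exclusive end.

```ruby
# Hypothetical helper: collect [start, end) offsets of inserted text,
# measured in the NEW string, from a diff-match-patch style diff.
# Assumes each entry is [op, text] with op -1 (delete), 0 (equal), 1 (insert).
def inserted_ranges(diffs)
  pos = 0        # current 0-based offset in the new string
  ranges = []
  diffs.each do |op, text|
    next if op == -1                              # deleted text is not in the new string
    ranges << [pos, pos + text.length] if op == 1 # record the inserted span
    pos += text.length                            # equal and inserted text advance the offset
  end
  ranges
end

diffs = [[-1, "Apple"], [1, "Banana"], [0, "s are a"], [1, "lso"], [0, " fruit."]]
p inserted_ranges(diffs)  # prints [[0, 6], [13, 16]]
```

For the documentation example, the two ranges cover "Banana" and "lso" in "Bananas are also fruit." Add 1 to each start if you want 1-based positions like those in the question.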
Licensed under: CC-BY-SA with attribution