Question

I am using difflib.HtmlDiff to compare two files. I want the differences to be highlighted in the outputted html.

This already works when at most two characters differ in a line:

a = "2.000"
b = "2.120"

But when more characters differ in a line, the whole line is marked red (on the left side of the table) or green (on the right side):

a = "2.000"
b = "2.123"

Is this behaviour configurable? That is, can I set the number of differing characters at which a line is marked as deleted / added?

EDIT:

Example:

import difflib
diff=difflib.HtmlDiff()
print(diff.make_file(
'''
2.000
2.000
2.000
'''.splitlines(),
'''
2.001
2.010
2.011
'''.splitlines()))

Gives me this output:

[image: HtmlDiff output table]

Line 2 is the output I want: it highlights the difference in yellow. Line 3 looks odd to me because it does not detect the one-character change and instead shows it as a delete plus an add. Line 4 is like line 3, except that the whole line is marked.


Solution

difflib's algorithm does not claim to yield minimal edit sequences. Although that statement comes from the docs for SequenceMatcher, I suspect it applies to difflib in general, and to HtmlDiff in particular.
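To see what difflib makes of the three line pairs from the question, here is a minimal sketch that prints the SequenceMatcher similarity ratio and edit opcodes for each pair. The helper name `similarity` is mine. One assumption to flag: as far as I can tell from the CPython source, the Differ class that HtmlDiff builds on only attempts character-level marking when a pair's ratio reaches roughly 0.75, and that cutoff is an internal detail, not a documented option.

```python
import difflib


def similarity(old, new):
    """Return the SequenceMatcher ratio 2*M/T for two strings (hypothetical helper)."""
    return difflib.SequenceMatcher(None, old, new).ratio()


if __name__ == "__main__":
    old = "2.000"
    for new in ("2.001", "2.010", "2.011"):
        matcher = difflib.SequenceMatcher(None, old, new)
        # get_opcodes() lists the edits difflib found; they are not guaranteed minimal
        print(new, round(matcher.ratio(), 2), matcher.get_opcodes())
```

The first two pairs share four matching characters out of ten (ratio 0.8), while "2.011" shares only three (ratio 0.6). If the cutoff assumption above holds, that would explain why only the last line is marked as a whole.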

While googling around for "python alternative difflib minimal edit" I found google-diff-match-patch. If you try out their demo for Diff with your example strings, it yields

[image: output of the diff-match-patch demo]

Although the output is not exactly what you requested, it does show that it found the minimal edits.

The API docs state

diff_prettyHtml(diffs) => html

Takes a diff array and returns a pretty HTML sequence. This function is mainly intended as an example from which to write ones own display functions.

which suggests looking at the source code for diff_prettyHtml might be a good starting point from which to build the HTML table you are looking for.
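As a rough illustration of what such a display function could look like, here is a sketch that always highlights at character level, whatever the similarity of the two lines. To stay within the standard library it uses difflib.SequenceMatcher opcodes rather than diff-match-patch, so the edits it finds are not guaranteed minimal. The function names, the positional pairing of lines, and the `diff_chg` class name are assumptions made for the example.

```python
import difflib
from html import escape


def highlight_pair(old, new):
    """Return (old_html, new_html) with differing characters wrapped in <span> tags."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    old_parts, new_parts = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        old_chunk, new_chunk = escape(old[i1:i2]), escape(new[j1:j2])
        if tag == "equal":
            old_parts.append(old_chunk)
            new_parts.append(new_chunk)
        else:
            # 'replace', 'delete' and 'insert' are all rendered as a change
            if old_chunk:
                old_parts.append(f'<span class="diff_chg">{old_chunk}</span>')
            if new_chunk:
                new_parts.append(f'<span class="diff_chg">{new_chunk}</span>')
    return "".join(old_parts), "".join(new_parts)


def diff_table(old_lines, new_lines):
    """Build a two-column HTML table; assumes line k on the left pairs with line k on the right."""
    rows = []
    for old, new in zip(old_lines, new_lines):
        left, right = highlight_pair(old, new)
        rows.append(f"<tr><td>{left}</td><td>{right}</td></tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


if __name__ == "__main__":
    print(diff_table(["2.000", "2.000", "2.000"], ["2.001", "2.010", "2.011"]))
```

Pairing lines by position only works when both files have the same number of lines; for real files you would first need a line-level diff to decide which lines belong together.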

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow