Question

I am using the difflib.HtmlDiff class to compare two sets of text (HTML from websites), building the table like this:

import difflib

html_diff = difflib.HtmlDiff()
print(html_diff.make_table(previous_contents, fetch_url.page_contents))

However, it seems to compare character by character (one character per table row), and I end up with a 4.3 MB text file for two sets of HTML which are only 100 KB.

The documentation says:

Compares fromlines and tolines (lists of strings) and returns a string which is a 
complete HTML file containing a table showing line by line differences with 
inter-line and intra-line changes highlighted.

However, that doesn't seem to be what happens.

Any suggestions?


Solution

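You're supplying strings, not lists of strings (lines). A string is itself a sequence of characters, so difflib treats each character as a separate "line", which is why you get one character per table row.

Split each text into lines first. splitlines() handles both UNIX and Windows line endings, whereas split('\n') would leave a trailing '\r' on every line of Windows-style text:

print(html_diff.make_table(previous_contents.splitlines(),
                           fetch_url.page_contents.splitlines()))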
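
For reference, here is a small self-contained sketch of the same idea, written for Python 3. The helper name html_diff_table and the sample strings are made up for illustration; only difflib.HtmlDiff, make_table and str.splitlines come from the standard library.

```python
import difflib


def html_diff_table(old_text, new_text):
    """Return an HTML table diffing two multi-line strings line by line.

    Hypothetical helper for illustration; not part of difflib.
    """
    # splitlines() splits on \n, \r\n and \r and drops the terminators,
    # so both inputs become lists of lines as make_table expects.
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    return difflib.HtmlDiff().make_table(old_lines, new_lines)


# Sample inputs (made up): one with UNIX and one with Windows line endings.
old = "<html>\n<body>\n<p>Hello</p>\n</body>\n</html>"
new = "<html>\r\n<body>\r\n<p>Goodbye</p>\r\n</body>\r\n</html>"

table = html_diff_table(old, new)
print(table)
```

If the pages are large and mostly identical, make_table also accepts context=True and numlines=N, which should limit the output to the changed lines plus a few lines of surrounding context.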
Licensed under: CC-BY-SA with attribution