html5lib implements the HTML parsing algorithm defined in the HTML spec, which is the same algorithm all major browsers implement. lxml uses libxml2's HTML parser, which is ultimately based on libxml2's XML parser, and its error handling for invalid HTML does not match what any other implementation does.
Most web developers only test with web browsers (standards be damned), so if you want to get what the page's author intended, you'll likely need to use something like html5lib, which matches current browsers.
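To see the difference for yourself, here is a minimal sketch that feeds the same deliberately broken markup to both parsers and serializes the resulting `<body>`. The sample markup and the helper names `parse_html5lib` and `parse_lxml` are made up for illustration, and each helper simply returns `None` if its library isn't installed. What lxml produces can vary with the libxml2 version, so I'm not claiming any particular output for it.

```python
import importlib.util
import xml.etree.ElementTree as ET

# Deliberately invalid markup: <b> is closed before the <i> nested inside it.
BROKEN = "<p><b><i>x</b>y</i></p>"


def parse_html5lib(markup):
    """Serialize <body> as parsed by html5lib, or return None if it is not installed."""
    if importlib.util.find_spec("html5lib") is None:
        return None
    import html5lib

    # namespaceHTMLElements=False keeps tag names plain ("body", not "{ns}body").
    root = html5lib.parse(markup, treebuilder="etree", namespaceHTMLElements=False)
    return ET.tostring(root.find("body"), encoding="unicode")


def parse_lxml(markup):
    """Serialize <body> as parsed by lxml (libxml2), or return None if it is not installed."""
    if importlib.util.find_spec("lxml") is None:
        return None
    from lxml import etree

    # etree.HTMLParser() selects libxml2's HTML parser instead of the XML one.
    root = etree.fromstring(markup, etree.HTMLParser())
    return etree.tostring(root.find("body"), encoding="unicode")


if __name__ == "__main__":
    print("html5lib:", parse_html5lib(BROKEN))
    print("lxml:    ", parse_lxml(BROKEN))
```

If the two lines printed differ for a page you care about, the html5lib one is the tree a browser would have built from the same bytes.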