Why, that is completely valid HTML! However, you can decode the entities using HTML::Entities from CPAN.
use HTML::Entities;
...;
my $html = $response->decoded_content;
my $decoded_string = decode_entities($html);
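As a minimal, self-contained sketch of what decode_entities does (the sample string below is made up for illustration and is not from the question):

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Hypothetical sample input; any entity-encoded string would do.
my $encoded = 'Caf&eacute; &amp; bar &lt;b&gt;';

# In non-void context decode_entities returns a decoded copy
# and leaves $encoded untouched.
my $decoded = decode_entities($encoded);

# $decoded now holds: Café & bar <b>
binmode STDOUT, ':encoding(UTF-8)';
print "$decoded\n";
```

Note that the decoded result contains a bare `<b>`, which is exactly why decoding a whole document before parsing it can be risky.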
The docs for HTTP::Response::decoded_content state that the Content-Encoding and charset are reversed, not HTML entities (which are an HTML/XML language feature, not really an encoding).
Edit:
However, as ikegami pointed out, decoding the entities immediately could render the HTML unparsable: a literal &lt; in the text would turn into <, which a parser reads as the start of a tag. Therefore, it might be best to parse the HTML first (e.g. using HTML::Tree), so that entities are resolved only in the text nodes.
use HTML::TreeBuilder;
my $url = ...;
my $tree = HTML::TreeBuilder->new_from_url($url); # invokes LWP automatically
my $decoded_text = $tree->as_text; # dumps the tree as flat text; the parser has already decoded entities in text nodes, so decoding again would double-decode
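To try the parse-first approach without a network request, here is a rough sketch that parses a string instead of a URL. The markup is a hypothetical example, and it assumes HTML::TreeBuilder's default behaviour of decoding entities in text nodes while parsing (i.e. no_expand_entities is not set):

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical markup: the entities must stay encoded until after parsing,
# otherwise the "<" would be mistaken for the start of a tag.
my $html = '<p>1 &lt; 2 &amp;&amp; caf&eacute;</p>';

my $tree = HTML::TreeBuilder->new_from_content($html);
my $text = $tree->as_text;   # text nodes come back with entities already decoded
$tree->delete;               # release the tree explicitly when done

binmode STDOUT, ':encoding(UTF-8)';
print "$text\n";
```

Because the parser has already resolved the entities here, passing the result through decode_entities again would turn something like `&amp;lt;` into `<` instead of the intended `&lt;`.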