ikegami is correct, but he didn't really explain what's wrong. To quote the docs for XML::LibXML::Document:
IMPORTANT: unlike toString for other nodes, on document nodes this function returns the XML as a byte string in the original encoding of the document (see the actualEncoding() method)!
(serialize is just an alias for toString.)
When you print a byte string to a filehandle marked with an :encoding layer, Perl treats each byte as a separate character, as if the string were ISO-8859-1, and encodes it again. Since you have a string containing UTF-8 bytes, it gets double encoded.
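To make the double encoding concrete, here is a minimal sketch that uses only core modules. The byte string is a stand-in for what serialize returns on a UTF-8 document, and the in-memory filehandles are just a convenient way to inspect what each layer writes; neither comes from the original question.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Stand-in for the output of serialize on a UTF-8 document:
# a byte string holding the two UTF-8 bytes of U+00E9 (e with acute).
my $bytes = encode('UTF-8', "\x{E9}");

# Print the bytes through an :encoding(UTF-8) layer into an in-memory file.
my $layered = '';
open my $enc_fh, '>:encoding(UTF-8)', \$layered or die "open failed: $!";
print {$enc_fh} $bytes;
close $enc_fh;

# Print the same bytes through a raw handle.
my $raw = '';
open my $raw_fh, '>:raw', \$raw or die "open failed: $!";
print {$raw_fh} $bytes;
close $raw_fh;

# Show what was actually written, byte by byte, in hex.
my $hex = sub { join ' ', map { sprintf '%02X', ord($_) } split //, $_[0] };
print 'layered: ', $hex->($layered), "\n";   # C3 83 C2 A9 (double encoded)
print 'raw:     ', $hex->($raw),     "\n";   # C3 A9       (unchanged)
```

The layered handle turns each of the two input bytes into its own two-byte UTF-8 sequence, which is the garbage you see when the output is read back as UTF-8.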
As ikegami said, use binmode(STDOUT) to remove the encoding layer from STDOUT. You could also decode the result of serialize back into characters before printing it, but that assumes the document is using the same encoding you have set on your output filehandle. (Otherwise, you'll emit an XML document whose actual encoding doesn't match what its header claims.) If you're printing to a file instead of STDOUT, open it with '>:raw' to avoid double encoding.
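Put together, the three options might look like the sketch below. The sample document and the temporary output file are made up for illustration, since the XML from the question isn't shown here; it assumes XML::LibXML is installed.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);
use XML::LibXML;

# Hypothetical sample document, given as UTF-8 bytes.
my $doc = XML::LibXML->load_xml(
    string => qq{<?xml version="1.0" encoding="UTF-8"?>\n<p>caf\xC3\xA9</p>\n},
);

# On a document node this is a byte string in the document's encoding.
my $bytes = $doc->serialize;

# Option 1: remove any encoding layer from STDOUT and print the bytes as-is.
binmode(STDOUT);
print $bytes;

# Option 2: decode back into characters, for a handle that keeps its
# :encoding layer. Only safe if that layer matches the document's encoding.
my $chars = decode($doc->actualEncoding, $bytes);

# Option 3: when writing to a file, open it raw so the bytes pass through.
# (A temporary file stands in for your real output path.)
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close $tmp_fh;
open my $out, '>:raw', $tmp_name or die "Cannot open $tmp_name: $!";
print {$out} $bytes;
close $out;
```

Options 1 and 3 are the safer choices because they never reinterpret the bytes, so the output always matches the encoding declared in the XML header.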