It's hard to test without your data, but my guess is that Perl is writing the output as ISO-8859-1. The strings you get back from XML::Parser are character strings, and STDOUT has no encoding layer by default, so Perl has no information about which encoding to use when printing them. Try binmode STDOUT, ':utf8';
before printing.
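
As a minimal sketch of what that call changes (the string below is made-up sample data, not taken from your file):

```perl
use strict;
use warnings;

# A character string such as a handler might receive; \x{e9} is "é".
my $text = "caf\x{e9}";

# Tell Perl to encode everything written to STDOUT as UTF-8.
# Without this layer, é would be written as the single Latin-1 byte 0xE9.
# ':encoding(UTF-8)' is a stricter alternative to ':utf8'.
binmode STDOUT, ':utf8';

print "$text\n";   # é is now written as the two bytes 0xC3 0xA9
```

The binmode call has to come before the first print, because it only affects data written after the layer is set.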
Also, it may not be a great idea to read the file yourself and then pass a string to the parser. Using parsefile
(on the file name) is safer: the parser reads the raw bytes itself and takes the encoding from the XML declaration, so you potentially avoid encoding problems.
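
Here is a rough, self-contained sketch of the parsefile approach. The temporary file, its contents, and the Char handler are assumptions for illustration only; your real handlers and file name will differ.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Parser;

# Encode everything printed to STDOUT as UTF-8.
binmode STDOUT, ':utf8';

# Write a small Latin-1 encoded sample document (hypothetical data).
my ($fh, $filename) = tempfile(SUFFIX => '.xml', UNLINK => 1);
binmode $fh;   # write raw bytes, no encoding layer
print {$fh} qq{<?xml version="1.0" encoding="ISO-8859-1"?>\n<name>caf\xE9</name>\n};
close $fh or die "close failed: $!";

# Collect character data; the Char handler receives the expat object
# and a chunk of text (one element's text may arrive in several chunks).
my $text = '';
my $parser = XML::Parser->new(
    Handlers => {
        Char => sub {
            my ($expat, $string) = @_;
            $text .= $string;
        },
    },
);

# Pass the file name, not the file contents: the parser opens the file
# itself and decodes it according to the XML declaration.
$parser->parsefile($filename);

print "$text\n";
```

The point of the design is that your own code never has to decode the file; it only has to pick an output encoding.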