Question

I have two systems, one RHEL5 and one Ubuntu 10.04, and they behave differently. I'm using Perl's XML::Simple to parse the response of a call to some SaaS software. The response is:

    <xml answer="{&quot;foo&quot;: &quot;bar&quot;}" />

The Ubuntu system correctly returns {"foo": "bar"}, but the RHEL5 system leaves the &quot; entities undecoded in the attribute value, and I cannot find an option to change this.

Yes, the XML::Simple versions are slightly different (and I cannot change that); RHEL5: 2.14, Ubuntu: 2.18. I'd love to solve this so that the behavior is consistent.


Solution

Delete the XML::SAX::PurePerl section from the file whose path is printed by

    perl -MFile::Basename -e'print dirname($ARGV[0])."/SAX/ParserDetails.ini\n"' "`perldoc -l XML::SAX`"
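
For orientation, ParserDetails.ini is an INI-style file with one section per registered SAX parser. The fragment below is only a sketch of what it typically looks like. The second section is an assumption for illustration, and the actual entries depend on which SAX parsers are installed on the system. The section to delete is the `[XML::SAX::PurePerl]` header together with the lines under it.

```ini
; Delete this section (the header and the feature lines below it).
[XML::SAX::PurePerl]
http://xml.org/sax/features/namespaces = 1

; Assumed example of another registered parser; leave such sections alone.
[XML::LibXML::SAX]
http://xml.org/sax/features/namespaces = 1
```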

XML::SAX::PurePerl is awful!

  • It's slow. And I mean CRAZY slow.
  • It doesn't handle encodings correctly.
  • And apparently, it doesn't handle entities correctly either.

If you want the best performance from XML::Simple, make sure to use

    local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';
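
As a minimal sketch of how that setting might be applied to the response from the question: the helper name `parse_answer` is hypothetical, and the code assumes XML::Parser is installed alongside XML::Simple.

```perl
use strict;
use warnings;
use XML::Simple;

# parse_answer is a hypothetical helper name, not part of XML::Simple.
sub parse_answer {
    my ($xml) = @_;

    # Force the XML::Parser backend for this call only; `local` restores
    # the previous value when the sub returns.
    # Assumption: XML::Parser is installed.
    local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';

    # With default options the root element is dropped and its
    # attributes become keys of the returned hash.
    my $doc = XMLin($xml);
    return $doc->{answer};
}

print parse_answer('<xml answer="{&quot;foo&quot;: &quot;bar&quot;}" />'), "\n";
```

With the entities decoded, this should print the JSON string from the question, {"foo": "bar"}, on both systems.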

Caveat: XML::Simple's namespace support (the NSExpand option) requires a SAX parser, so it is unavailable when XML::Parser is used.

Note: XML::LibXML is still 17x faster than XML::Simple with XML::Parser.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow