Question

XML: Why is the null character disallowed even in CDATA sections?

It seems to terminate the file right there.

Any solution? Base64?


Solution

You might find your answer in this previous question:

Why are "control" characters illegal in XML 1.0?

OTHER TIPS

Because it is not a valid XML character, i.e. it should produce a parse error. This is likely for historical reasons (null-terminated strings) and because of XML's plain-text nature: anything on which a Unicode-capable editor might choke is discouraged.

It shouldn't 'terminate the file', but it should generate a well-formedness error. It's disallowed because so much of the world is still using null-terminated string processing, so allowing a \0 is likely to cause trouble at some unspecified point down the processing chain.

This can possibly even be a security vulnerability; there have been many exploits in the past that have relied on the interfacing of systems which allow \0 and those which take it as a terminator. The safest thing to do, therefore, is simply to disallow it.
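To check how a given parser treats a NUL, a minimal sketch using Python's standard-library `xml.etree.ElementTree` (which is backed by expat, an XML 1.0 parser) is shown here. The helper name `is_well_formed` and the sample documents are made up for illustration; other parsers may report the error differently.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: bytes) -> bool:
    """Return True if the standard-library parser accepts the document."""
    try:
        ET.fromstring(document)
    except ET.ParseError:
        # The parser reports a well-formedness error instead of
        # silently truncating the input at the NUL byte.
        return False
    return True


# Hypothetical sample documents: one valid, three containing a NUL.
samples = {
    "plain text": b"<a>ok</a>",
    "raw NUL in text": b"<a>\x00</a>",
    "raw NUL in CDATA": b"<a><![CDATA[\x00]]></a>",
    "character reference to NUL": b"<a>&#0;</a>",
}

for label, doc in samples.items():
    verdict = "accepted" if is_well_formed(doc) else "rejected"
    print(f"{label}: {verdict}")
```

With this parser, all three NUL variants should be rejected, including the one wrapped in a CDATA section.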

Other control characters can be written as &#...; character references in XML 1.1, but not inside CDATA sections, where references are not interpreted. In XML 1.0 there is no way to include the C0 control characters at all, apart from tab, line feed, and carriage return. It is, after all, supposed to be a text-based, human-readable format.
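The difference between the two versions comes down to the `Char` production in each specification. As a rough sketch, the ranges can be transcribed into Python like this; the function names are invented here, and the ranges are copied from the `Char` and `RestrictedChar` productions of XML 1.0 and XML 1.1.

```python
def is_xml10_char(cp: int) -> bool:
    """XML 1.0 Char: tab, LF, CR, and everything from U+0020 upward
    except surrogates, U+FFFE, and U+FFFF."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_char(cp: int) -> bool:
    """XML 1.1 Char: like 1.0, but the range starts at U+0001."""
    return (
        0x1 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def is_xml11_restricted(cp: int) -> bool:
    """XML 1.1 RestrictedChar: allowed only as a character reference."""
    return (
        0x1 <= cp <= 0x8
        or 0xB <= cp <= 0xC
        or 0xE <= cp <= 0x1F
        or 0x7F <= cp <= 0x84
        or 0x86 <= cp <= 0x9F
    )


for cp in (0x0, 0x1, 0x9, 0x1F, 0x41):
    print(f"U+{cp:04X}: 1.0={is_xml10_char(cp)} "
          f"1.1={is_xml11_char(cp)} restricted={is_xml11_restricted(cp)}")
```

Note that U+0000 falls outside both productions, so neither version offers a way to represent it, escaped or not.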

Base64?

Yes. But if you are processing mostly big chunks of binary, encapsulating it in XML is probably not a reasonable choice.
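For small payloads, the Base64 route might look like the following sketch. The element name `blob`, the `encoding` attribute, and the helper names are assumptions chosen for illustration, not part of any standard schema.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(payload: bytes) -> bytes:
    """Serialize arbitrary bytes as Base64 text inside a hypothetical <blob> element."""
    elem = ET.Element("blob", {"encoding": "base64"})
    # Base64 output uses only ASCII letters, digits, '+', '/', and '=',
    # all of which are legal XML characters.
    elem.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(elem)


def unwrap_binary(document: bytes) -> bytes:
    """Parse the document and decode the element text back to the original bytes."""
    elem = ET.fromstring(document)
    return base64.b64decode(elem.text or "")


payload = b"\x00\x01binary\x00data"
document = wrap_binary(payload)
print(document)
print(unwrap_binary(document) == payload)
```

Base64 inflates the data by roughly a third, which is one reason XML is a poor container when most of the content is binary.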

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow