Section 2.7 of the XML specification defines the format of a CDATA section:
CDATA Sections
[18] CDSect ::= CDStart CData CDEnd
[19] CDStart ::= '<![CDATA['
[20] CData ::= (Char* - (Char* ']]>' Char*))
[21] CDEnd ::= ']]>'
Char is defined in Section 2.2:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
If you look at your raw data, it contains over a dozen character values that fall outside those ranges (specifically #x0, #x1, #x2, #x4, #x5, #x6, #x8, #xB, #xE, #x18, #x19, #x1A, and #x1C). That is why you are getting errors about illegal characters: your data really does contain illegal characters.
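To see exactly which characters a parser will reject, you can test each code point against the Char production quoted above. This is only a rough sketch in Python, not anything taken from your code; the function names `is_xml_char` and `illegal_xml_chars` and the sample string are made up for illustration.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def illegal_xml_chars(text: str) -> list:
    """Return (offset, code point) pairs for characters XML 1.0 does not allow."""
    return [(i, ord(ch)) for i, ch in enumerate(text) if not is_xml_char(ord(ch))]


# Hypothetical sample: tab (#x9) is legal, but #x0, #x1 and #x1C are not.
sample = "ok\x00\x01\tstill ok\x1c"
for offset, cp in illegal_xml_chars(sample):
    print(f"offset {offset}: #x{cp:X} is not a legal XML character")
```

Running a check like this over the decoded text before building the document tells you up front whether wrapping it in CDATA could ever work.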
A CDATA section does not give you permission to put arbitrary binary data into an XML document. It is meant for text content that contains characters normally reserved for XML markup (such as < and &), so that they do not have to be escaped or encoded as entities. The only way to put binary data into an XML document is to encode it in an XML-compatible (typically 7-bit ASCII) format, such as Base64 (other text encodings, such as hexadecimal, work as well).
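As a minimal sketch of the Base64 approach, here is a round trip using Python's standard `base64` and `xml.etree.ElementTree` modules. The element name `payload`, the `encoding` attribute, and the sample bytes are assumptions chosen for the example; use whatever your schema calls for.

```python
import base64
import xml.etree.ElementTree as ET

# Hypothetical binary payload containing bytes that are illegal as XML characters.
raw = bytes([0x00, 0x01, 0x02, 0x0B, 0x1C, 0xFF])

# Encode: Base64 output uses only A-Z, a-z, 0-9, '+', '/' and '=',
# all of which are legal XML characters and need no escaping or CDATA.
root = ET.Element("payload", encoding="base64")
root.text = base64.b64encode(raw).decode("ascii")
xml_bytes = ET.tostring(root)

# Decode: parse the document and reverse the encoding on the receiving side.
parsed = ET.fromstring(xml_bytes)
restored = base64.b64decode(parsed.text)

print(xml_bytes.decode("ascii"))
print(restored == raw)
```

The cost is size: Base64 output is roughly a third larger than the input, which is usually an acceptable trade for a document that every conforming parser will accept.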