Question

I am trying a simple PHP XML example, as follows.

// code of PHP
===================================================
<?php
  $string = <<<XML
<?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>George</to>
<from>John</from>
<heading>Reminder</heading>
<body>Don't forget the meeting!</body>
</note>
XML;

  print "BEGIN<br>";
  print "String:<br>{$string}";

  $xml = simplexml_load_string($string);
  print "<br>XML Obj:<br>";
  print_r($xml);
  print "<br>Var Dump:<br>";
  var_dump($xml);
  print "<br>END";
?>
===================================================

and the output seems OK:

// output
===================================================
BEGIN
String:
George John Reminder Don't forget the meeting!
XML Obj:
SimpleXMLElement Object ( [to] => George [from] => John [heading] => Reminder [body] => Don't forget the meeting! )
Var Dump:
object(SimpleXMLElement)#1 (4) { ["to"]=> string(6) "George" ["from"]=> string(4) "John" ["heading"]=> string(8) "Reminder" ["body"]=> string(25) "Don't forget the meeting!" }
END
===================================================

However, when I try to format the first line of the heredoc string by indenting it, i.e. adding two blanks before the line <?xml version="1.0" encoding="ISO-8859-1"?>, simplexml_load_string() always fails and the $xml object info is not output.

// code of PHP
===================================================
<?php
  $string = <<<XML
  <?xml version="1.0" encoding="ISO-8859-1"?>
<note>
<to>George</to>
<from>John</from>
<heading>Reminder</heading>
<body>Don't forget the meeting!</body>
</note>
XML;
... ...
===================================================

// output
===================================================
BEGIN
String:
George John Reminder Don't forget the meeting!
XML Obj:

Var Dump:
bool(false)
END
===================================================

Hope someone can help me!!! Thanks very much.

Solution

This is not well-formed XML. Have a look at the grammar for documents in the XML specification:

document    ::=     prolog element Misc*
prolog      ::=     XMLDecl? Misc* (doctypedecl Misc*)?
XMLDecl     ::=     '<?xml' VersionInfo EncodingDecl? SDDecl? S? '?>'

The document production describes which other grammar tokens an XML document is built from, starting with a prolog. The prolog may contain an XMLDecl, which in turn starts with <?xml. In short: if there is a declaration, no whitespace is allowed before it.
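
The rule can be checked directly. The following is a minimal sketch, assuming the libxml error functions (libxml_use_internal_errors(), libxml_get_errors()) are available, which is normally the case wherever SimpleXML is enabled. The variable names are illustrative only, and the exact error text depends on the installed libxml version.

```php
<?php
// Collect libxml parse errors instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$valid   = '<?xml version="1.0" encoding="ISO-8859-1"?><note><to>George</to></note>';
$invalid = '  ' . $valid; // same document, two spaces before the declaration

$ok  = simplexml_load_string($valid);
$bad = simplexml_load_string($invalid);

var_dump($ok instanceof SimpleXMLElement); // bool(true)
var_dump($bad);                            // bool(false)

// Print whatever libxml reported for the failed parse.
foreach (libxml_get_errors() as $error) {
    echo trim($error->message), " (line {$error->line})\n";
}
libxml_clear_errors();
```

Checking the return value against false and reading the collected errors is usually more informative than print_r() or var_dump() on the result alone.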


If you control the XML (which you seem to): just don't do that. I'd regard it as malicious, since everybody who has to deal with the code after you will wonder what's going on.
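
If the extra blanks were only meant to indent the heredoc along with the surrounding code, PHP 7.3 and later allow this without putting whitespace into the string: the closing marker may be indented, and the same amount of indentation is removed from every line of the body. A minimal sketch, assuming PHP 7.3+ (the function name buildNote is hypothetical):

```php
<?php
// Requires PHP 7.3+: the closing marker may be indented, and that same
// amount of indentation is stripped from every line of the heredoc body.
function buildNote(): string
{
    return <<<XML
        <?xml version="1.0" encoding="ISO-8859-1"?>
        <note>
          <to>George</to>
        </note>
        XML;
}

$string = buildNote();

var_dump(strpos($string, '<?xml') === 0);           // bool(true): no leading whitespace
var_dump(simplexml_load_string($string) !== false); // bool(true)
```

On older PHP versions the closing marker must start in the first column, so the declaration has to stay unindented there.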

If you cannot change the input, because somebody else is sending broken XML files:

  • tell them to ship well-formed XML, or
  • "preprocess" the non-XML with trim($string), as already proposed in the comments.
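
The trim() workaround might look like the sketch below. The $received string stands in for hypothetical third-party input; it is not taken from the question.

```php
<?php
// Hypothetical input from a third party: whitespace precedes the declaration.
$received = "  \n<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n<note><to>George</to></note>\n";

libxml_use_internal_errors(true);

// trim() strips leading and trailing whitespace, so the declaration
// becomes the very first thing in the string again.
$xml = simplexml_load_string(trim($received));

if ($xml === false) {
    // Still not parseable: the problem is something other than stray whitespace.
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), "\n";
    }
    libxml_clear_errors();
} else {
    echo $xml->to, "\n"; // George
}
```

This only repairs stray whitespace around the document; any other well-formedness error still makes simplexml_load_string() return false.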
Licensed under: CC-BY-SA with attribution