In Perl, I'm using XML::Twig to read an XML file. Some of the attributes contain text that looks like this:

&lt;p&gt;Here is some text.&lt;/p&gt;&#xA;&#xA;&lt;p&gt;Some more text.

I'm reading this attribute into a variable named $Body. I'd like to print this variable out to a file without interpolating the special characters in the string, i.e., the output should look exactly like the input. My code looks like:

open (my $OUT, ">", "out.csv") or die $!;
print $OUT $Body;

However, when I look in out.csv, I see:

<p>Here is some text.</p>

<p>Some more text.

Instead, I'd like to see the original string:

&lt;p&gt;Here is some text.&lt;/p&gt;&#xA;&#xA;&lt;p&gt;Some more text.

I've tried the following with no success:

  • print $OUT '$Body'; Doesn't work, just shows "$Body"
  • print $OUT "$Body"; Doesn't work, same as no quotes.
  • print $OUT qw{$Body}; Doesn't work, just shows "$Body".

Here is a complete example:

tmp.xml

<?xml version="1.0" encoding="utf-8"?>
<root>
  <node Body="&lt;p&gt;Here is some text.&lt;/p&gt;&#xA;&#xA;&lt;p&gt;Some more text."/>
</root>

Code:

#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

my $t = XML::Twig->new();
$t->parsefile("tmp.xml");

my $root = $t->root;

open(my $OUT, ">", "out.csv") or die "Cannot open out.csv: $!";

my @nodes = $root->children('node');   # get the <node> children
foreach my $node (@nodes) {
    my $Body = $node->att('Body');      # attribute value, already entity-decoded by the parser
    print $OUT $Body;
}

close($OUT);

Result:

[dev@mogli:/swta] $ ./script.pl 
[dev@mogli:/swta] $ cat out.csv 
<p>Here is some text.</p>

<p>Some more text.

Solution

XML::Twig is decoding the entities when it parses the file. Pass the keep_encoding option to the constructor to prevent this:

my $t = XML::Twig->new(keep_encoding => 1);
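For context, here is a minimal sketch of how that option might slot into the script from the question. It assumes XML::Twig is installed, inlines the contents of tmp.xml so it can be run on its own, and prints to STDOUT instead of out.csv. The helper name raw_bodies is made up for illustration and is not part of XML::Twig.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Same document as tmp.xml in the question, inlined so the sketch is self-contained.
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<root>
  <node Body="&lt;p&gt;Here is some text.&lt;/p&gt;&#xA;&#xA;&lt;p&gt;Some more text."/>
</root>
XML

# raw_bodies() is a hypothetical helper name, not an XML::Twig function.
# It returns the Body attribute of every <node> child of the root element.
sub raw_bodies {
    my ($doc) = @_;
    my $t = XML::Twig->new(keep_encoding => 1);   # keep entities as written in the source
    $t->parse($doc);                              # parse a string instead of a file
    return map { $_->att('Body') } $t->root->children('node');
}

print "$_\n" for raw_bodies($xml);
```

Note that keep_encoding applies to the whole document, so any other text or attributes read from the same twig will presumably come back with their entities still encoded as well.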

Other tips

Printing a scalar does not change it whatsoever[1].

$ cat a.pl
use strict;
use warnings;

my $Body = '&lt;p&gt;Here is some text.&lt;/p&gt;&#xA;&#xA;&lt;p&gt;Some more text.';
open(my $OUT, ">", "out.csv") or die "Cannot open out.csv: $!";
print $OUT $Body;
close($OUT);

$ perl a.pl

$ cat out.csv
&lt;p&gt;Here is some text.&lt;/p&gt;&#xA;&#xA;&lt;p&gt;Some more text.

$Body doesn't contain what you think it does. XML::Twig correctly returned the attribute's value, <p>Here .... If the attribute is supposed to contain &lt;p&gt;Here ..., the XML file should contain &amp;lt;p&amp;gt;Here ....
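If changing the XML file is not an option, another possibility is to re-escape the decoded value before printing it. The following is only a sketch: escape_xml_attr is a hypothetical helper (not part of XML::Twig), and the set of characters it escapes is an assumption chosen to match the example in the question. It cannot reproduce the source byte for byte in general, since the parser does not tell you whether a newline was written as &#xA; or &#10;.

```perl
use strict;
use warnings;

# escape_xml_attr() is a hypothetical helper, not an XML::Twig function.
# It re-encodes the characters that were entities in the question's attribute.
sub escape_xml_attr {
    my ($text) = @_;
    my %entity = (
        '&'  => '&amp;',
        '<'  => '&lt;',
        '>'  => '&gt;',
        '"'  => '&quot;',
        "\n" => '&#xA;',
    );
    # Single pass, so the '&' introduced by a replacement is never escaped again.
    $text =~ s/([&<>"\n])/$entity{$1}/g;
    return $text;
}

# The decoded value that XML::Twig hands back for the Body attribute.
my $Body = "<p>Here is some text.</p>\n\n<p>Some more text.";
print escape_xml_attr($Body), "\n";
```

For the sample attribute this prints the same string that appears in tmp.xml.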


Notes:

  1. Unless you instruct it to by adding an :encoding layer or some such, or unless you're on Windows, where text-mode file handles translate LF to CRLF by default.
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow