It's not clear to me exactly what the goal is, since your desired output format doesn't look particularly useful. Nonetheless, the example below should be enough to get you on your way. It addresses two points:
- The 'xref' that is missing from your current output: the label is replaced with its child nodes, so they are kept rather than discarded.
- How to add arbitrary whitespace (basically PCDATA content) to a document.
As a side note: I've not used XML::Twig before; the documentation is actually pretty good if you are comfortable with XML concepts.
use strict;
use warnings;
use XML::Twig;
my $twig = XML::Twig->new(
    twig_handlers => {
        'fig' => \&figure
    },
    pretty_print => 'indented',
);
$twig->parse(do { local $/; <DATA> });
$twig->print;
sub figure {
    my ( $twig, $figure ) = @_;
    # Find all children of type label (would there really be more than one?)
    foreach my $label ($figure->children('label')) {
        # Replace the label with its children nodes (this keeps the xref)
        $label->replace_with($label->cut_children);
        # Find the caption and place four spaces before it
        if (my $caption = $figure->first_child('caption')) {
            my $some_whitespace = XML::Twig::Elt->new('#PCDATA' => ' ' x 4);
            $some_whitespace->paste(before => $caption);
        }
    }
}
__DATA__
<xml>
<fig id="fig6_4">
<label><xref ref-type="page" id="page_54"/>[Figure 4]</label>
<caption>The Klein Sexual Orientation Grid</caption>
</fig>
</xml>
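
If all you want is to drop the label wrapper while keeping what is inside it, a shorter variant may do. The sketch below assumes XML::Twig's erase method, which removes an element and leaves its children in its place. The helper name unwrap_labels and its padding argument are made up for illustration and are not part of XML::Twig. It returns the result with sprint instead of printing it, and leaves pretty_print off so the inserted whitespace is easy to see.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not part of XML::Twig): unwrap every <label> inside
# a <fig> and insert $padding as a text node before the <caption>.
sub unwrap_labels {
    my ( $xml, $padding ) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            fig => sub {
                my ( $t, $fig ) = @_;

                # erase() deletes the element but keeps its children in place
                $_->erase for $fig->children('label');

                # Add the padding as a PCDATA node in front of the caption
                if ( my $caption = $fig->first_child('caption') ) {
                    XML::Twig::Elt->new( '#PCDATA' => $padding )
                                  ->paste( before => $caption );
                }
            },
        },
    );

    $twig->parse($xml);
    return $twig->sprint;
}

my $input = '<xml><fig id="fig6_4">'
          . '<label><xref ref-type="page" id="page_54"/>[Figure 4]</label>'
          . '<caption>The Klein Sexual Orientation Grid</caption>'
          . '</fig></xml>';

my $out = unwrap_labels( $input, ' ' x 4 );
print $out, "\n";
```

Unlike the handler above, this adds the padding once per fig even when there is no label, so adjust it if that matters for your data.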