Question

I'm trying to extract a small amount of data from an XML file into a csv file using perl and XML::Simple.

Here is an edited version of the data:

<?xml version="1.0" encoding="UTF-8"?>
<orders xmlns="http://www.demandware.com/xml/impex/order/2006-10-31">
    <order order-no="W100148941">
        <order-date>2011-08-22T16:15:47.000Z</order-date>
        <custom-attributes>
            <custom-attribute attribute-id="basket_notes">bnotes974211</custom-attribute>
            <custom-attribute attribute-id="omOrderID">974211</custom-attribute>
        </custom-attributes>
    </order>
</orders>

using this script:

#!/usr/bin/perl

use XML::Simple;
use Data::Dumper;

$xml = new XML::Simple;
$data = $xml->XMLin("$ARGV[0]", ForceArray=>1);


print Dumper($data);
foreach $o (@{$data->{order}}) {
    print "$ARGV[1]", ",";
    print "$ARGV[2]", ",";
    print "$ARGV[3]", ",";
    print "$ARGV[4]", ",";
    print $o->{"order-no"}, ",";
    print $o->{"order-date"}, ",";
    foreach my $o ( @{ $data->{'custom-attribute'} } ) {
        print 'in level 1';
        foreach my $attr ( @{ $data->{'custom-attribute'} } ) {
            print 'in level 2';
            if ( $attr->{'attribute-id'} eq 'basket_notes' ) {
                print '"', $data->{'content'}, '"', ",";
            }
        }
    }
    print "\n";
}

gets me this output:

,,,,W100148941,ARRAY(0x7f7f63a524c0),

Without the ForceArray option, XMLin puts the correct value in place of the ARRAY(...) above, but the script then won't work with files that contain only one data element. And, as is evident, this code never makes it into the custom attribute array to print anything.

What am I doing wrong?

update:

changing the looping code in the above to this:

foreach $o (@{$data->{order}})
{
print "$ARGV[1]", ",";
print "$ARGV[2]", ",";
print "$ARGV[3]", ",";
print "$ARGV[4]", ",";
print $o->{"order-no"}, ",";
#print $o->{"order-date"}, ",";
print $o->{"order-date"}->[0], ",";
foreach my $o ( @{ $data->{'custom-attributes'} } ) {
    print 'in level 1';
   foreach my $attr ( @{ $o->{'custom-attribute'} } ) {
        print 'in level 2';
        if ( $attr->{'attribute-id'} eq 'omOrderID' ) {
            print '"', $data->{'content'}, '"', ",";
        }
    }
}

print "\n";
}

yields this:

,,,,W100148941,2011-08-22T16:15:47.000Z,

It would appear that the code is just not getting into the custom-attributes loop, and I don't know why.


Solution

Your problem is that "order-date" is ALSO being forced into an arrayref by ForceArray, as you can see from your existing Dumper output:

...
     'order-date' => [
                     '2011-08-22T16:15:47.000Z'
                     ],

Therefore, you need to do one of the following:

  • If order-date will always be a single value, hard-code printing the first array value:

    print $o->{"order-date"}->[0], ",";

  • If order-date will always be a single value, change your options by passing more detailed ForceArray instructions.

    The XML::Simple POD shows that, besides the simple ForceArray => 1 option, you can pass a list of the tags you want forced into arrays (e.g. ForceArray => [ "order", "custom-attributes", "custom-attribute" ]). Tags that are not in the list, such as order-date, then stay plain scalars.

  • If order-date can occur multiple times, simply print it in a loop, as you already do with the other repeated tags:

    foreach my $order_date ( @{ $o->{'order-date'} } ) { print "$order_date," }
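
The selective ForceArray option can be sketched as follows. This is a minimal example, assuming XML::Simple is installed; the inline $xml_text string is a copy of the sample data from the question, and the choice of tags to force (order and custom-attribute) is an assumption about which elements may repeat in real files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple;

# Copy of the sample document from the question (XMLin also accepts a string).
my $xml_text = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<orders xmlns="http://www.demandware.com/xml/impex/order/2006-10-31">
    <order order-no="W100148941">
        <order-date>2011-08-22T16:15:47.000Z</order-date>
        <custom-attributes>
            <custom-attribute attribute-id="basket_notes">bnotes974211</custom-attribute>
            <custom-attribute attribute-id="omOrderID">974211</custom-attribute>
        </custom-attributes>
    </order>
</orders>
XML

# Force only the tags assumed to repeat; other elements are not wrapped in arrays.
my $data = XMLin($xml_text, ForceArray => [ 'order', 'custom-attribute' ]);

my $order      = $data->{order}[0];            # forced into an array
my $order_date = $order->{'order-date'};       # plain scalar, no ->[0] needed
my $attr_count = scalar @{ $order->{'custom-attributes'}{'custom-attribute'} };

print join(',', $order->{'order-no'}, $order_date, $attr_count), "\n";
```

Because order is in the list, the same code keeps working when a file contains a single order element.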


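Also, you have a couple of bugs in your nested loops. The custom-attributes element sits inside each order, so it has to be read from the current order ($o), not from $data.

Your first loop should be

foreach my $attrs ( @{ $o->{'custom-attributes'} } ) { # You had $data->{'custom-attribute'}

And the second loop should loop over the sub-structures of that:

    foreach my $attr ( @{ $attrs->{'custom-attribute'} } ) { # instead of $data->...

Inside the inner loop, print $attr->{'content'} rather than $data->{'content'}, since the text belongs to the individual attribute.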
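
Taken together, the traversal can be sketched as below. To stay self-contained, the sketch hard-codes $data in the shape that XMLin(..., ForceArray => 1) returns for the sample file instead of parsing a file. Note that it reads custom-attributes from the current order, because that element is nested inside each order in the sample data. The helper name order_to_row and its $wanted_id parameter are hypothetical.

```perl
use strict;
use warnings;

# Shape returned by XMLin($file, ForceArray => 1) for the sample data:
# nested elements become array refs, XML attributes stay plain scalars.
my $data = {
    order => [ {
        'order-no'          => 'W100148941',
        'order-date'        => ['2011-08-22T16:15:47.000Z'],
        'custom-attributes' => [ {
            'custom-attribute' => [
                { 'attribute-id' => 'basket_notes', content => 'bnotes974211' },
                { 'attribute-id' => 'omOrderID',    content => '974211' },
            ],
        } ],
    } ],
};

# Hypothetical helper: build one comma-separated row for a single order.
sub order_to_row {
    my ($o, $wanted_id) = @_;
    my @fields = ( $o->{'order-no'}, $o->{'order-date'}[0] );
    for my $attrs ( @{ $o->{'custom-attributes'} || [] } ) {
        for my $attr ( @{ $attrs->{'custom-attribute'} || [] } ) {
            next unless $attr->{'attribute-id'} eq $wanted_id;
            push @fields, '"' . $attr->{content} . '"';   # text of this attribute
        }
    }
    return join ',', @fields;
}

print order_to_row($_, 'omOrderID'), "\n" for @{ $data->{order} };
```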

Leaving all that aside, from my fairly considerable experience, converting XML to a flat file (CSV) is somewhat of a bad idea, to put it mildly. Please seriously consider whether you are doing the right thing at all.

There is no way to map nested data onto flat rows properly or easily without some crafty encoding, and decoding that crafty encoding later is no easier than simply reading in the XML again.

  • If you need to convert it so it can be readable by another program, keep the XML or convert to JSON

  • If you need to convert it to show to a human, use Data::Dumper or some other pretty printer

  • If you need to show it to a human as a GUI, develop a good GUI to match your data structure.
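
For the machine-readable case, the nested structure can be written out as JSON without flattening it. The following is a minimal sketch using the core JSON::PP module; $data is hard-coded in the shape assumed for a selective ForceArray on order and custom-attribute, so the example runs without an input file.

```perl
use strict;
use warnings;
use JSON::PP;

# Assumed shape of the parsed sample data (hard-coded for self-containment).
my $data = {
    order => [ {
        'order-no'          => 'W100148941',
        'order-date'        => '2011-08-22T16:15:47.000Z',
        'custom-attributes' => {
            'custom-attribute' => [
                { 'attribute-id' => 'basket_notes', content => 'bnotes974211' },
                { 'attribute-id' => 'omOrderID',    content => '974211' },
            ],
        },
    } ],
};

# canonical() sorts hash keys so the output is stable between runs.
my $json = JSON::PP->new->canonical->pretty->encode($data);
print $json;

# Decoding returns the same nested structure; no custom encoding is involved.
my $round_trip = JSON::PP->new->decode($json);
```

The nesting of orders and their attributes survives the round trip, which is the property a flat CSV row cannot offer without extra conventions.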

OTHER TIPS

In addition to the answer from DVK:

I believe you need to enclose your outermost loop

foreach $o (@{$data->{order}})

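in another loop, as the "order" items are enclosed in an "orders" element. Note that this only applies if the root element is kept (the KeepRoot option); by default XMLin discards the root element name, in which case $data->{order} is already the right place to look.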

    foreach $oo (@{$data->{orders}}) {
       foreach $o (@{$oo->{order}})
       {
       ....
       }
    }  #additional closing for the additional foreach
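
Whether the extra loop is needed depends on XML::Simple's KeepRoot option, since XMLin discards the root element name by default. The sketch below, which assumes XML::Simple is installed and uses a cut-down inline document, compares both settings; it normalises the root to an array ref before looping so that it does not depend on how ForceArray treats the root.

```perl
use strict;
use warnings;
use XML::Simple;

# Cut-down stand-in for the sample file.
my $xml_text = '<orders><order order-no="W100148941"/></orders>';

my $default  = XMLin($xml_text, ForceArray => 1);                # root discarded
my $keeproot = XMLin($xml_text, ForceArray => 1, KeepRoot => 1); # root kept

# Default: orders are directly under $data, one loop is enough.
my @default_nos = map { $_->{'order-no'} } @{ $default->{order} };

# KeepRoot => 1: one extra level named after the root element.
my $roots = $keeproot->{orders};
$roots = [$roots] unless ref $roots eq 'ARRAY';
my @keeproot_nos;
for my $oo (@$roots) {
    push @keeproot_nos, $_->{'order-no'} for @{ $oo->{order} };
}

print "default:  @default_nos\n";
print "keeproot: @keeproot_nos\n";
```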

Best regards,

Olivier.

Licensed under: CC-BY-SA with attribution