First of all, don't use XML::Simple. It is hard to predict exactly what data structure it will produce from a bit of XML, and its own documentation strongly discourages its use in new code.
Anyway, your problem is that you want to access an id field in the product and substrate subhashes, but they don't exist in one of the reaction subhashes:
    '15' => {
        'substrate' => {
            '104' => {
                'name' => 'cpd:C00118'
            },
            '109' => {
                'name' => 'cpd:C05382'
            }
        },
        'name' => 'rn:R01641',
        'type' => 'reversible',
        'product' => {
            '112' => {
                'name' => 'cpd:C00231'
            },
            '110' => {
                'name' => 'cpd:C00117'
            }
        }
    },
Instead, the keys are numbers (apparently the id values themselves), and each value is a hash containing only a name. The other reaction has a totally different structure, so special-case code would have to be written for both. This is why XML::Simple shouldn't be used: the output is just too unpredictable.
Enter XML::LibXML. It is not extraordinary, but it implements standard APIs like the DOM and XPath to traverse your XML document.
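    use strict;
    use warnings;
    use feature 'say'; # assuming perl 5.010 or later
    use XML::LibXML;

    # load_xml takes `location` (a file name or URL) and throws on parse errors
    my $doc = XML::LibXML->load_xml(location => "test.xml");

    # select every product and substrate that is a direct child of a reaction
    for my $reaction_item ($doc->findnodes('//reaction/product | //reaction/substrate')) {
        say $reaction_item->getAttribute('id');
    }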
Output:
108
109
109
104
110
112
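
If you also need to know which reaction each id belongs to, you can run a second XPath query relative to each reaction node. The following is only a minimal sketch: the inline XML is a made-up stand-in for your test.xml (the element and attribute names are assumed from the dumped structure, and the first reaction's id and names are placeholders), and reaction_items is a hypothetical helper name, not part of XML::LibXML.

```perl
use strict;
use warnings;
use feature 'say';
use XML::LibXML;

# Hypothetical sample input: the second reaction mirrors the dump above,
# the first one uses placeholder values. The layout is an assumption.
my $xml = <<'XML';
<pathway>
  <reaction id="14" name="rn:EXAMPLE" type="irreversible">
    <substrate id="108" name="cpd:EXAMPLE1"/>
    <product id="109" name="cpd:EXAMPLE2"/>
  </reaction>
  <reaction id="15" name="rn:R01641" type="reversible">
    <substrate id="109" name="cpd:C05382"/>
    <substrate id="104" name="cpd:C00118"/>
    <product id="110" name="cpd:C00117"/>
    <product id="112" name="cpd:C00231"/>
  </reaction>
</pathway>
XML

# Return one "reaction-name role id name" string per substrate/product.
sub reaction_items {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);
    my @rows;
    for my $reaction ($doc->findnodes('//reaction')) {
        # This XPath is evaluated relative to the current reaction node.
        for my $item ($reaction->findnodes('./substrate | ./product')) {
            push @rows, join ' ',
                $reaction->getAttribute('name'),
                $item->nodeName,              # "substrate" or "product"
                $item->getAttribute('id'),
                $item->getAttribute('name');
        }
    }
    return @rows;
}

say for reaction_items($xml);
```

Because every reaction is handled by the same loop, it makes no difference whether it has one substrate or several, which is exactly the case that needed special handling with XML::Simple.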