
Perl XML::Simple: parsing nodes with the same name

I have the following XML file:

<?xml version="1.0"?>
<!DOCTYPE pathway SYSTEM "http://www.kegg.jp/kegg/xml/KGML_v0.7.1_.dtd">
<pathway name="path:ko01200" org="ko" >
    <entry id="1" >
        <graphics name="one" 
             type="circle" />
    </entry>
    <entry id="7" >
        <graphics name="one" 
             type="rectangle" />
        <graphics name="two" 
             type="rectangle"/>
    </entry>
</pathway>

I tried to parse it using XML::Simple with the following code, but I am stuck because one of the entry nodes has two graphics elements, so it complains. I assume I need another foreach loop over the graphics elements, but I don't know how to proceed.

use strict;
use warnings;
use XML::Simple;
use Data::Dumper;

my $xml=new XML::Simple;
my $data=$xml->XMLin("file.xml",KeyAttr => ['id']);
print Dumper($data);    
foreach my $entry (   keys %{$data->{entry}} ) {
    print $data->{entry}->{$entry}->{graphics}->{type}."\n";            
}

Here is the output:

$VAR1 = {
      'entry' => {
                 '1' => {
                        'graphics' => {
                                      'name' => 'one...',
                                      'type' => 'circle'
                                    }
                      },
                 '7' => {
                        'graphics' => [
                                      {
                                        'name' => 'one',
                                        'type' => 'rectangle'
                                      },
                                      {
                                        'name' => 'two',
                                        'type' => 'rectangle'
                                      }
                                    ]
                      }
               },
      'org' => 'ko',
      'name' => 'path:ko01200'
    };
circle
Not a HASH reference at stack.pl line 12.
user1876128 asked Sep 30 '13 08:09


1 Answer

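XML::Simple is inconsistent because it is up to the user to enable strict mode, so the graphics node is sometimes a hash reference and sometimes an array reference, depending on how many graphics elements an entry contains. You can normalize it to an array reference before looping: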

for my $entry ( keys %{$data->{entry}} ) {
    my $graphics = $data->{entry}{$entry}{graphics};
    # A single graphics child arrives as a hash ref; wrap it so both cases are arrays
    $graphics = [ $graphics ] if ref $graphics eq "HASH";
    print "$_->{type}\n" for @$graphics;
}

There are better modules for XML parsing; consider XML::LibXML.
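
As a rough illustration of the XML::LibXML route, here is a minimal sketch that reads the same data with XPath. It assumes XML::LibXML is installed; the inline document is a copy of the question's file with the DOCTYPE line left out so nothing external is referenced, and the helper name graphics_types is made up for this example.

```perl
use strict;
use warnings;
use XML::LibXML;

# Inline copy of the question's document (DOCTYPE omitted for this sketch).
my $xml = <<'XML';
<?xml version="1.0"?>
<pathway name="path:ko01200" org="ko">
    <entry id="1">
        <graphics name="one" type="circle"/>
    </entry>
    <entry id="7">
        <graphics name="one" type="rectangle"/>
        <graphics name="two" type="rectangle"/>
    </entry>
</pathway>
XML

# Hypothetical helper: returns one "id type" string per graphics element,
# in document order.
sub graphics_types {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);
    my @rows;
    for my $entry ($doc->findnodes('/pathway/entry')) {
        my $id = $entry->getAttribute('id');
        # findnodes returns a list whether there are zero, one, or many matches,
        # so no hash-versus-array check is needed.
        for my $graphics ($entry->findnodes('./graphics')) {
            push @rows, $id . ' ' . $graphics->getAttribute('type');
        }
    }
    return @rows;
}

print "$_\n" for graphics_types($xml);
```

Because every lookup returns a node list, an entry with one graphics element and an entry with two are handled by the same loop.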

Alternatively, as @RobEarl suggested, use the ForceArray parameter:

 XMLin("file.xml",KeyAttr => ['id'], ForceArray => [ 'graphics' ]);
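
To show how the loop looks once ForceArray is in place, here is a small self-contained sketch. It passes an inline copy of the question's XML (without the DOCTYPE line) to XMLin as a string instead of reading file.xml, and the helper name graphics_types_simple is invented for illustration.

```perl
use strict;
use warnings;
use XML::Simple;

# Inline copy of the question's document (DOCTYPE omitted for this sketch).
my $xml = <<'XML';
<?xml version="1.0"?>
<pathway name="path:ko01200" org="ko">
    <entry id="1">
        <graphics name="one" type="circle"/>
    </entry>
    <entry id="7">
        <graphics name="one" type="rectangle"/>
        <graphics name="two" type="rectangle"/>
    </entry>
</pathway>
XML

# Hypothetical helper: returns one "id type" string per graphics element.
sub graphics_types_simple {
    my ($xml_string) = @_;
    # ForceArray makes graphics an array ref even when an entry has only one.
    my $data = XMLin($xml_string, KeyAttr => ['id'], ForceArray => ['graphics']);
    my @rows;
    # Hash keys are unordered, so sort the entry ids numerically.
    for my $id (sort { $a <=> $b } keys %{ $data->{entry} }) {
        push @rows, "$id $_->{type}" for @{ $data->{entry}{$id}{graphics} };
    }
    return @rows;
}

print "$_\n" for graphics_types_simple($xml);
```

Note that entry is only folded into a hash keyed by id because the file has more than one entry element; a file with a single entry would likely need 'entry' added to the ForceArray list as well.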
mpapec answered Oct 13 '22 17:10