
How can I get content using XML::Twig?

Tags: xml, perl, xml-twig

My aim is for the start_tag_handlers callback (see below) to get the content of apps/title when it finds an apps/title tag (see the sample XML below).

Likewise, the end_tag_handlers callback should get the content of apps/logs when it finds an apps/logs tag.

Instead, the handlers print no content and the script exits.

This is the Perl code for parsing (using XML::Twig):

    #!/usr/local/bin/perl -w

    use XML::Twig;
    my $twig = XML::Twig->new(
                start_tag_handlers =>
                  { 'apps/title' => \&kicks
                  },
                twig_roots =>
                  { 'apps' => \&app
                  },
                end_tag_handlers =>
                  { 'apps/logs' => \&bye
                  }
                );
    $twig -> parsefile( "doc.xml");

    sub kicks {
        my ($twig, $elt) = @_;
        print "---kicks--- \n";
        print $elt -> text;
        print " \n";
    }

    sub app {
        my ($twig, $apps) = @_;
        print "---app--- \n";
        print $apps -> text;
        print " \n";
    }


    sub bye {
        my ($twig, $elt) = @_;
        print "bye \n";
        print $elt->text;
        print " \n";
    }

This is doc.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <auto>
      <apps>
        <title>watch</title>
        <commands>set,start,00:00,alart,end</commands>
        <logs>csv</logs>
      </apps>
      <apps>
        <title>machine</title>
        <commands>down,select,vol_100,check,line,end</commands>
        <logs>dump</logs>
      </apps>
    </auto>

This is the output in the console:

    C:\>perl parse.pl
    ---kicks---

    ---app---
    watchset,start,00:00,alart,endcsv
    ---kicks---

    ---app---
    machinedown,select,vol_100,check,line,enddump

asked Sep 19 '09 03:09 by tknv


1 Answer

Check out the XML::Twig documentation for start_tag_handlers:

The handlers are called with 2 params: the twig and the element. The element is empty at that point, its attributes are created though.

When a start_tag_handlers callback is called, the parser has not yet seen the text content: only the start tag (e.g. <title>, not the end tag </title>) has been parsed at that point.

The reason that end_tag_handlers don't supply element text is probably for symmetry :-).
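
To make the timing concrete, here is a minimal sketch that registers both a start_tag_handlers callback and a twig_handlers callback on the same path. The inline XML and the id attribute are assumptions added for illustration (the original doc.xml has no attributes), and @seen is just a hypothetical collector.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical input: the id attribute is NOT part of the original doc.xml.
my $xml = '<auto><apps><title id="t1">watch</title></apps></auto>';

my @seen;    # collects one line per handler call
my $twig = XML::Twig->new(
    start_tag_handlers => {
        # Runs right after <title ...> is parsed: attributes exist, text does not yet.
        'apps/title' => sub {
            my ($t, $elt) = @_;
            push @seen, sprintf "start: id=%s text='%s'", $elt->att('id'), $elt->text;
        },
    },
    twig_handlers => {
        # Runs once </title> is parsed: the element is complete.
        'apps/title' => sub {
            my ($t, $elt) = @_;
            push @seen, sprintf "complete: id=%s text='%s'", $elt->att('id'), $elt->text;
        },
    },
);
$twig->parse($xml);
print "$_\n" for @seen;
```

The first line printed should show the attribute with an empty text, and the second the same attribute with the text watch, which matches the empty ---kicks--- output in the question.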

What you want is probably to use twig_handlers instead:

    my $twig = XML::Twig->new(
        twig_handlers => {
            'apps/title' => \&kicks,
            'apps/logs'  => \&bye,
        },
        twig_roots => {
            'apps' => \&app,
        },
    );

Output is:

    ---kicks---
    watch
    bye
    csv
    ---app---
    watchset,start,00:00,alart,endcsv
    ---kicks---
    machine
    bye
    dump
    ---app---
    machinedown,select,vol_100,check,line,enddump
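
If only the title and logs values of each apps element are needed, one possible variation is a single twig_handlers callback on apps that reads the children by name. This is a sketch rather than the approach above: the XML is inlined from the question's doc.xml so the script runs on its own, and @apps is a hypothetical collector.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Sample data copied from the question's doc.xml, inlined to keep the sketch self-contained.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<auto>
  <apps>
    <title>watch</title>
    <commands>set,start,00:00,alart,end</commands>
    <logs>csv</logs>
  </apps>
  <apps>
    <title>machine</title>
    <commands>down,select,vol_100,check,line,end</commands>
    <logs>dump</logs>
  </apps>
</auto>
XML

my @apps;    # one hash per <apps> element
my $twig = XML::Twig->new(
    twig_handlers => {
        # Called when </apps> is reached, so all children are available.
        apps => sub {
            my ($t, $apps) = @_;
            push @apps, {
                title => $apps->first_child_text('title'),
                logs  => $apps->first_child_text('logs'),
            };
            $t->purge;    # release what has been parsed so far
        },
    },
);
$twig->parse($xml);

printf "%s -> %s\n", $_->{title}, $_->{logs} for @apps;
```

For a file on disk, `$twig->parsefile("doc.xml")` can replace the `parse` call, as in the question.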

answered Sep 21 '22 03:09 by Inshallah