How to list XML node attributes with XML::LibXML?

Tags: attr, xml, perl, xml-libxml

Given the following XML snippet:

<outline>
  <node1 attribute1="value1" attribute2="value2">
    text1
  </node1>
</outline>

How do I get this output?

outline
node1=text1
node1 attribute1=value1
node1 attribute2=value2

I have looked into XML::LibXML::Reader, but that module appears to only provide access to attribute values referenced by their names. How do I get the list of attribute names in the first place?

asked Nov 07 '14 07:11

Alexander Shcheblikin

2 Answers

Something like this should help you.

It's not clear from your question whether <outline> is the root element of the data, or if it is buried somewhere in a bigger document. It's also unclear how general you want the solution to be - e.g. do you want the entire document dumped in this manner?

Anyway, this program generates the output you requested from the given XML input in a fairly concise manner.

use strict;
use warnings;
use 5.014;     # for the /r non-destructive substitution modifier

use XML::LibXML;

my $xml = XML::LibXML->load_xml(IO => \*DATA);

my ($node) = $xml->findnodes('//outline');

print $node->nodeName, "\n";

for my $child ($node->getChildrenByTagName('*')) {
  my $name = $child->nodeName;

  printf "%s=%s\n", $name, $child->textContent =~ s/\A\s+|\s+\z//gr;

  for my $attr ($child->attributes) {
    printf "%s %s=%s\n", $name, $attr->getName, $attr->getValue;
  }
}

__DATA__
<outline>
  <node1 attribute1="value1" attribute2="value2">
    text1
  </node1>
</outline>

output

outline
node1=text1
node1 attribute1=value1
node1 attribute2=value2
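
If the whole document should be dumped in this manner rather than just the children of <outline>, the same idea can be applied recursively. The following is only a sketch under that assumption: dump_element is a hypothetical helper name, and it joins just the direct text children of each element, so that text belonging to nested elements is not repeated on the parent's line.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return the output lines for $element and,
# recursively, for all of its descendant elements.
sub dump_element {
    my ($element) = @_;
    my $name = $element->nodeName;

    # Join only the direct text children, then trim surrounding whitespace.
    my $text = join '', map { $_->nodeValue } $element->findnodes('text()');
    $text =~ s/\A\s+|\s+\z//g;

    my @lines = length($text) ? ("$name=$text") : ($name);

    # One line per attribute, in document order.
    for my $attr ($element->attributes) {
        push @lines, sprintf('%s %s=%s', $name, $attr->nodeName, $attr->value);
    }

    # Recurse into child elements.
    for my $child ($element->findnodes('*')) {
        push @lines, dump_element($child);
    }
    return @lines;
}

my $doc = XML::LibXML->load_xml(string => <<'XML');
<outline>
  <node1 attribute1="value1" attribute2="value2">
    text1
  </node1>
</outline>
XML

print "$_\n" for dump_element($doc->documentElement);
```

For the sample input this should print the same four lines as shown above.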

answered Oct 11 '22 21:10

Borodin

You can get the list of an element's attributes with $e->findnodes( "./@*");

Below is a solution using plain XML::LibXML rather than XML::LibXML::Reader that works with your test data. It may be sensitive to extra whitespace and mixed content, though, so test it on real data before using it.

#!/usr/bin/perl

use strict;
use warnings;

use XML::LibXML;

my $dom= XML::LibXML->load_xml( IO => \*DATA);
my $elements= $dom->findnodes( "//*");   # a NodeList, usable as an array ref

foreach my $e (@$elements)
  { print $e->nodeName;

    # text needs to be trimmed or line returns show up in the output
    my $text= $e->textContent;
    $text=~s{^\s*}{};
    $text=~s{\s*$}{};

    if( ! $e->getChildrenByTagName( '*') && length $text)   # length, so a text of "0" is still printed
      { print "=$text"; }
    print "\n"; 

    my @attrs= $e->findnodes( "./@*");
    # or, as suggested in Borodin's answer, $e->attributes

    foreach my $attr (@attrs)
      { print $e->nodeName, " ", $attr->nodeName, "=", $attr->value, "\n"; }
  }
__END__
<outline>
  <node1 attribute1="value1" attribute2="value2">
    text1
  </node1>
</outline>
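
For completeness, XML::LibXML::Reader can also enumerate attributes without knowing their names in advance, by stepping its cursor with moveToFirstAttribute and moveToNextAttribute. The following is a minimal sketch, assuming the XML is available as a string; attribute_lines is a hypothetical helper name, and it lists attributes only, not element text.

```perl
use strict;
use warnings;
use XML::LibXML::Reader;

# Hypothetical helper: return "element attribute=value" lines for
# every element found in the given XML string.
sub attribute_lines {
    my ($xml_string) = @_;
    my $reader = XML::LibXML::Reader->new(string => $xml_string)
        or die "cannot create reader\n";

    my @lines;
    while ($reader->read == 1) {
        next unless $reader->nodeType == XML_READER_TYPE_ELEMENT;
        my $element = $reader->name;

        # Both move methods return 1 while an attribute is available.
        if ($reader->moveToFirstAttribute == 1) {
            do {
                push @lines, sprintf('%s %s=%s',
                    $element, $reader->name, $reader->value);
            } while ($reader->moveToNextAttribute == 1);

            # Put the cursor back on the element that owns the attributes.
            $reader->moveToElement;
        }
    }
    return @lines;
}

print "$_\n" for attribute_lines(<<'XML');
<outline>
  <node1 attribute1="value1" attribute2="value2">
    text1
  </node1>
</outline>
XML
```

The Reader streams the document instead of building a full DOM, which may matter for large inputs; for small files the DOM version above is simpler.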

answered Oct 11 '22 20:10

mirod
