The XML file is like this:
<?xml version="1.0" encoding="UTF-8"?>
<resource-data xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="resource-data.xsd">
<class name="AP">
<attributes>
<resourceId>00 11 B5 1B 6D 20</resourceId>
<lastModifyTime>20130107091545</lastModifyTime>
<dcTime>20130107093019</dcTime>
<attribute name="NMS_ID" value="DNMS" />
<attribute name="IP_ADDR" value="10.11.141.111" />
<attribute name="LABEL_DEV" value="00 11 B5 1B 6D 20" />
</attributes>
<attributes>
<resourceId>00 11 B5 1B 6D 21</resourceId>
<lastModifyTime>20130107091546</lastModifyTime>
<dcTime>20130107093019</dcTime>
<attribute name="NMS_ID" value="DNMS" />
<attribute name="IP_ADDR" value="10.11.141.112" />
<attribute name="LABEL_DEV" value="00 11 B5 1B 6D 21" />
</attributes>
</class>
</resource-data>
And my code:
#!/usr/bin/perl
use Encode;
use XML::LibXML;
use Data::Dumper;
$parser = new XML::LibXML;
$struct = $parser->parse_file("d:/AP_201301073100_1.xml");
my $file_data = "d:\\ap.txt";
open IN, ">$file_data";
$rootel = $struct->getDocumentElement();
$elname = $rootel->getName();
@kids = $rootel->getElementsByTagName('attributes');
foreach $child (@kids) {
@atts = $child->getElementsByTagName('attribute');
foreach $at (@atts) {
$va = $at->getAttribute('value');
print IN encode("gbk", "$va\t");
}
print IN encode("gbk", "\n");
}
close(IN);
My question is this: when the XML file is only about 80MB the program is very fast, but when the XML file is much larger the program becomes very slow. Can somebody help me speed this up, please?
Using XML::Twig will allow you to process each <attributes> element as it is encountered during parsing, and then discard the XML data that is no longer needed.
This program seems to do what you need.
use strict;
use warnings;
use XML::Twig;
use Encode;
use constant XML_FILE => 'S:/AP_201301073100_1.xml';
use constant OUT_FILE => 'D:/ap.txt';
open my $outfh, '>:encoding(gbk)', OUT_FILE or die $!;
my $twig = XML::Twig->new(twig_handlers => {attributes => \&attributes});
$twig->parsefile(XML_FILE);
sub attributes {
my ($twig, $atts) = @_;
my @values = map $_->att('value'), $atts->children('attribute');
print $outfh join("\t", @values), "\n";
$twig->purge;
}
Output:
DNMS 10.11.141.111 00 11 B5 1B 6D 20
DNMS 10.11.141.112 00 11 B5 1B 6D 21
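To try the handler-and-purge pattern without touching the large file, a minimal sketch like the following can be run on an inline sample. The sample XML is a trimmed copy of the data in the question, and the @rows array and the anonymous handler are illustrative choices, not part of the program above; for the real file you would keep printing each row to the output handle instead of collecting rows in memory.

```perl
use strict;
use warnings;
use XML::Twig;

# Trimmed sample based on the XML in the question (two <attributes> records).
my $xml = <<'XML';
<resource-data>
<class name="AP">
<attributes>
<resourceId>00 11 B5 1B 6D 20</resourceId>
<attribute name="NMS_ID" value="DNMS" />
<attribute name="IP_ADDR" value="10.11.141.111" />
</attributes>
<attributes>
<resourceId>00 11 B5 1B 6D 21</resourceId>
<attribute name="NMS_ID" value="DNMS" />
<attribute name="IP_ADDR" value="10.11.141.112" />
</attributes>
</class>
</resource-data>
XML

# Collected here only so the result is easy to inspect; hypothetical helper array.
my @rows;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called once each time a complete <attributes> element has been parsed.
        attributes => sub {
            my ($t, $elt) = @_;
            push @rows, join "\t",
                map { $_->att('value') } $elt->children('attribute');
            # Release everything parsed so far so memory use stays flat.
            $t->purge;
        },
    },
);

$twig->parse($xml);    # parse() takes a string; parsefile() takes a file name
print "$_\n" for @rows;
```

Because the handler fires per record and purge releases what has already been handled, the full document tree is never held in memory at once, which is the main difference from the getElementsByTagName approach in the question.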