When I use the Perl module Storable to store and retrieve XML::LibXML document objects, I get a segmentation fault. Specifically, it happens as soon as I call the XML::LibXML method findnodes on the retrieved object. My question is: why, and is there a way around it?
For example, consider the following code:
use strict;
use warnings;
use XML::LibXML;
use Storable;
use Devel::Size qw/size/;
use Data::Dumper qw/Dumper/;
my $file = "test.xml";
my $store_file = "./dom.xml.store";
my $dom;
if (-e $store_file) {
    $dom = retrieve $store_file;
    print "dom retrieved from storable $store_file\n";
} else {
    $dom = XML::LibXML->load_xml(location => $file);
    print "dom retrieved from xml $file\n";
    store $dom, $store_file;
}
print Dumper $dom;
print "Dom ref(". ref($dom) . "), size(" . size($dom) . ")\n";
# test
foreach my $title ($dom->findnodes('//title')) {
    print $title->to_literal() . "\n";
}
Running this code twice yields the following result:
$~/perl:perl libxml.pl
dom retrieved from xml test.xml
$VAR1 = bless( do{\(my $o = '93874673292592')}, 'XML::LibXML::Document' );
Dom ref(XML::LibXML::Document), size(72)
Apollo 13
Solaris
Ender's Game
Interstellar
The Martian
$~/perl:perl libxml.pl
dom retrieved from storable ./dom.xml.store
$VAR1 = bless( do{\(my $o = '93874673292592')}, 'XML::LibXML::Document' );
Dom ref(XML::LibXML::Document), size(72)
Segmentation fault
$~/perl:
The XML file, which I named test.xml, is the same file used in the tutorial here. I ran similar tests on my system with unblessed and blessed Perl objects, and neither caused a segmentation fault.
Usual disclaimers: Perl version v5.26.3, Storable version 3.11, XML::LibXML version 2.0132.
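Storable only stores Perl values, i.e. HASH, SCALAR, ARRAY, REF, etc. When you store an XML::LibXML::Document using Storable, it doesn't store the underlying memory that XML::LibXML has allocated. As your Dumper output shows, the object is just a blessed scalar holding a number, which is presumably the address of that memory. Hence the segmentation fault: in the second run the re-created blessed scalar still carries the old number, and findnodes attempts to read what it thinks is allocated memory.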
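To illustrate the point without crashing anything, here is a minimal sketch that uses only core modules. The class name My::FakeHandle and the variable names are hypothetical stand-ins; the number is copied from the Dumper output in the question and is treated here as if it were a C pointer address, which is an assumption about how XML::LibXML represents its objects.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical stand-in for an XS-backed object: a blessed scalar whose
# only content is a number (pretend it is the address of a C structure).
my $fake_address = 93874673292592;
my $slot         = $fake_address;
my $obj          = bless \$slot, 'My::FakeHandle';

# Round-trip through Storable in memory (store/retrieve behave the same
# way, they just go through a file).
my $copy = thaw(freeze($obj));

# The class name and the number survive the round trip ...
print "class: ", ref($copy), "\n";
print "value: ", $$copy, "\n";

# ... but nothing else does: whatever memory the number pointed to in the
# original process was never part of the Perl value, so it is not saved.
```

A real XML::LibXML::Document retrieved this way looks intact to Dumper and ref for the same reason, and only fails once a method tries to follow the stale number into memory.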
You should be re-creating the XML::LibXML::Document in this program rather than relying on Storable. XML is a text-based format, so you should store the data as text.
Also, using Storable is generally considered a bad idea, in particular because of the security risks of retrieving data from a source you do not fully trust.
If the purpose of using Storable is to serialize and store the XML::LibXML::Document, then XML::LibXML itself has methods for that: toString and serialize (an alias).
toString is a DOM serializing function, so the DOM Tree is serialized into an XML string, ready for output
Then you can simply write this XML string to disk, and later create a DOM tree from it. Given that this is specifically for an XML::LibXML::Document, I don't see a need to store it in any format other than the XML string itself, but if you want one for some reason, an easy format to recommend is JSON. I use the Cpanel::JSON::XS library.
Please read the linked docs carefully for how this differs from the toString methods provided on nodes other than Document, and for other variants of toString and other ways to serialize the DOM. Later on the same page, the serialize method is explained to be an alias, provided for naming consistency.
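The approach described above could be sketched roughly as follows. The cache file name and the inline sample document are made up for illustration; in the question's script the document would come from test.xml instead. The file is written in raw mode on the assumption that toString on a Document returns already-encoded bytes.

```perl
use strict;
use warnings;
use XML::LibXML;

my $cache_file = 'dom.cache.xml';    # hypothetical file name

# Small inline document so the sketch is self-contained.
my $dom = XML::LibXML->load_xml(
    string => '<movies><title>Solaris</title><title>Apollo 13</title></movies>',
);

# Serialize the DOM tree to an XML string and write it to disk.
open my $out, '>:raw', $cache_file or die "Cannot write $cache_file: $!";
print {$out} $dom->toString;
close $out or die "Cannot close $cache_file: $!";

# Later (possibly in another process): rebuild the DOM by parsing the file.
my $restored = XML::LibXML->load_xml(location => $cache_file);

foreach my $title ($restored->findnodes('//title')) {
    print $title->to_literal, "\n";
}
```

This re-parses the XML on every run, so it does not save parsing time the way a working binary cache would, but the restored object is a real document that findnodes can use safely.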
The problem you are having arises because Storable is meant for plain Perl data structures. From its documentation:
The Storable package brings persistence to your Perl data structures containing SCALAR, ARRAY, HASH or REF objects, [...]
An XML::LibXML::Document, which represents a DOM tree, is not one of those pure Perl objects but a far more complex creature (mostly implemented in C), as mentioned in Rawley Fowler's answer. Also see the discussion about security risks with Storable, which is linked to in that answer.