I notice that XML::RSS::Parser hasn't been updated since 2005. Is this still the recommended library for parsing RSS or Atom? Is there a better one or a better way?
I'm not sure it's ever been the "recommended library". If I know which kind of feed I need to parse, I use XML::RSS or XML::Atom as appropriate, but if (as is more likely) I just know it's a web feed, I use XML::Feed.
Adding an example of using XML::Feed, as requested:
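use XML::Feed;

# parse() accepts a reference to a string holding the feed XML
# and returns undef on failure.
my $feed = XML::Feed->parse(\$string_containing_feed)
    or die XML::Feed->errstr;
foreach my $entry ($feed->entries) {
    print $entry->title, "\n";
    print $entry->content->body, "\n";
}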
This is all pretty much copied from the module documentation.
I actually like to avoid domain-specific XML parsers these days and just use XPath for everything. That way I only have to remember one API. (Unless it's a huge XML document, in which case I'll use an event-based parser like XML::Parser.)
So using XML::XPath, I can grab a bunch of stuff from an RSS file like this:
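use XML::XPath;

# get_rss() stands for whatever returns the feed XML as a string
my $rss = get_rss();
my $xp = XML::XPath->new( xml => $rss );
my $stories = $xp->find( '/rss/channel/item' );
foreach my $story ( $stories->get_nodelist ) {
    # evaluate each relative path with the current <item> as context
    my $url   = $xp->find( 'link', $story )->string_value;
    my $title = $xp->find( 'title', $story )->string_value;
    ...
}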
Not the prettiest code in the world, but it works.
If XML::RSS::Parser works for you then use it. I've used XML::Parser to deal with RSS but I had narrow requirements and XML::Parser was already installed.
Just because something hasn't been updated in a few years doesn't mean that it doesn't work anymore; I don't think the various RSS/Atom specs have changed recently, so there's no need for the parser to change.
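For anyone curious what the event-based XML::Parser route looks like for RSS, here is a minimal sketch. It is not the code referred to above; the sample feed and the variable names ($xml, @titles, $in_item, $in_title, $buffer) are made up for illustration, and it only collects the title of each item.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample feed, inlined so the sketch is self-contained.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Sample</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
XML

my @titles;
my $in_item  = 0;    # true while inside an <item>
my $in_title = 0;    # true while inside an item's <title>
my $buffer   = '';   # text collected for the current title

my $parser = XML::Parser->new(
    Handlers => {
        Start => sub {
            my ($expat, $element) = @_;
            $in_item = 1 if $element eq 'item';
            if ($in_item && $element eq 'title') {
                $in_title = 1;
                $buffer   = '';
            }
        },
        # Char may fire several times per text node, so append.
        Char => sub {
            my ($expat, $text) = @_;
            $buffer .= $text if $in_title;
        },
        End => sub {
            my ($expat, $element) = @_;
            if ($in_title && $element eq 'title') {
                push @titles, $buffer;
                $in_title = 0;
            }
            $in_item = 0 if $element eq 'item';
        },
    },
);

$parser->parse($xml);
print "$_\n" for @titles;
```

The channel-level title is skipped because the handlers only record titles seen while inside an item. This is more bookkeeping than the XPath version, which is the usual trade-off for not holding the whole document in memory.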
There is also a very nice module called XML::FeedPP
(see http://search.cpan.org/dist/XML-FeedPP/lib/XML/FeedPP.pm). XML::FeedPP
is not especially fast, but it is written in almost pure Perl and has minimal dependencies.
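A rough sketch of reading a feed with XML::FeedPP, assuming the feed is already in a string. The sample XML and the @stories list are invented for illustration; in real use you would more likely pass a URL or file name to the constructor.

```perl
use strict;
use warnings;
use XML::FeedPP;

# Hypothetical sample feed, inlined so the sketch is self-contained.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <link>http://example.com/</link>
    <description>Sample</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second story</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>
XML

# -type => 'string' says the source is XML text rather than a URL or file.
my $feed = XML::FeedPP->new($xml, -type => 'string');

# get_item() with no index returns every item in list context.
my @stories;
foreach my $item ($feed->get_item()) {
    push @stories, { title => $item->title(), url => $item->link() };
}

print "$_->{title} <$_->{url}>\n" for @stories;
```

The same accessor calls should work whether the source turns out to be RSS, RDF, or Atom, since XML::FeedPP detects the format when it loads the feed.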