Parsing simple XML with Nokogiri

I have the following XML:

<links>

  <item>
    <title>Title 1</title>
    <url>http://www.example.com/url-1</url>
  </item>

  <item>
   <title>Title 2</title>
   <url>http://www.example.com/url-2</url>
  </item>

  <item>
    <title>Title 3</title>
    <url>http://www.example.com/url-3</url>
  </item>

</links>

I would like to convert it to an HTML list:

<ul>
  <li><a href="http://www.example.com/url-1">Title 1</a></li>
  <li><a href="http://www.example.com/url-2">Title 2</a></li>
  <li><a href="http://www.example.com/url-3">Title 3</a></li>
</ul>

Currently I have this:

Controller:

require 'nokogiri'
doc = Nokogiri::XML(...)

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('//title'), 'url' => i.xpath('//url')}
end

Template:

<ul>
  <% @links.each do |l| %>
    <li><a href="<%= l['url'] %>"><%= l['title'] %></a></li>
  <% end %>
</ul> 

Resulting HTML:

<ul>
  <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
  <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
  <li><a href="http://www.example.com/url-1http://www.example.com/url-2http://www.example.com/url-3">Title 1Title 2Title 3</a></li>
</ul>

What am I doing wrong? Is there a more optimal way of doing this?

Vincent asked Oct 15 '10 01:10



2 Answers

Replace this:

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('//title'), 'url' => i.xpath('//url')}
end

with:

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('title'), 'url' => i.xpath('url')}
end

Explanation:

//title 

and

//url

are absolute XPath expressions. They select, respectively, all title and all url elements in the XML document, regardless of which node they are evaluated on.

Contrast this with:

title

and

url

These are relative XPath expressions. They select, respectively, only the title and url children of the current node.

Dimitre Novatchev answered Oct 14 '22 06:10


The trouble here is that the XPath //title searches for titles from the root of the document, and so returns all title tags. The XPath title searches within the context of the given node, which is what you want. The same applies to url.

@links = doc.xpath('//links/item').map do |i|
  {'title' => i.xpath('title'), 'url' => i.xpath('url')}
end

Matchu answered Oct 14 '22 08:10