Problem reading XML with Nokogiri

Tags:

xml

ruby

nokogiri

My Ruby script is supposed to read in an XML document from a URL and check it for well-formedness, returning any errors. I have a sample bad XML document hosted with the following text (from the Nokogiri tutorial):

<?xml version="1.0"?>
  <root>
    <open>foo
      <closed>bar</closed>
  </root>

My test script is as follows (url refers to the above XML file, hosted on my personal server):

require 'nokogiri'

document = Nokogiri::XML(url) 

puts document
puts document.errors

The output is:

<?xml version="1.0"?>
Start tag expected, '<' not found

Why is it only capturing the first line of the XML file? It does this even with known good XML files.

Paul asked Feb 24 '23 05:02


1 Answer

It is trying to parse the URL string itself, not the content at that URL. Keep in mind that the first parameter to Nokogiri::XML must be a string containing the document or an IO object, since it is just a shortcut to Nokogiri::XML::Document.parse, as stated in the Nokogiri documentation.

EDIT: To read from a URI:

require 'open-uri'
# On Ruby 3.0+ a bare open(uri) no longer fetches URLs; use URI.open (Ruby 2.5+).
URI.open(uri).read
Serabe answered Mar 01 '23 07:03
