
How to validate XML using Nokogiri?

I need to validate an XML document so that my code rejects invalid XML.

I did it this way to handle an invalid document:

begin
  xml ||= Nokogiri::XML(xml_data) do |config|
    config.strict
  end
rescue Nokogiri::XML::SyntaxError => e
  puts "caught exception: #{e}"
else
  # further processing if no error
end

But even for a valid XML document, it shows:

caught exception: Extra content at the end of the document

Sample XML I'm using:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE note SYSTEM "Note.dtd">
<note>
<to>Tove</to>
<from>Jani</from>
<heading>Reminder</heading>
<body>Don't forget me this weekend!</body>
</note>

What am I doing wrong?

Ahmad hamza asked Mar 02 '26 02:03



1 Answer

If you want to see whether a document is invalid XML, simply check the errors method of the returned document:

require 'nokogiri'

doc = Nokogiri::XML('<xml><foo></xml>')
doc.errors
# => [#<Nokogiri::XML::SyntaxError: Opening and ending tag mismatch: foo line 1 and xml>,
#     #<Nokogiri::XML::SyntaxError: Premature end of data in tag xml line 1>]

If Nokogiri finds any errors while parsing, it populates the errors array; an empty array means the document parsed cleanly.

the Tin Man answered Mar 03 '26 17:03
