Schema Validation using Nokogiri

Tags: ruby, xsd, nokogiri

I am trying to validate an XML document against a dozen or so schemas using Nokogiri. Currently I have a root schema document that imports all the other schemas, and I validate against that.

Can I point to each schema file from the XML file itself, and have Nokogiri look in the XML file for the schemas to validate against?

asked Feb 13 '12 16:02 by jkjustjoshing


1 Answer

The proper way to reference multiple schemata against which to validate an XML file is with the schemaLocation attribute:

<?xml version="1.0"?>
<foo xmlns="http://bar.com/foo"
     xmlns:bz="http://biz.biz/"
     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
     xsi:schemaLocation="http://bar.com/foo http://www.bar.com/schemas/foo.xsd
                         http://biz.biz/ http://biz.biz/xml/ns/bz.xsd">

For each namespace in your document you list a pair of whitespace-delimited values: the namespace URI, followed by a 'hint' as to where to find the schema for that namespace. If you provide a full URI for each hint, then you can process this with Nokogiri as follows:

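require 'nokogiri'
require 'open-uri'

XSI_NS = 'http://www.w3.org/2001/XMLSchema-instance'

# my_xml is the XML source to validate (a String or an IO).
doc = Nokogiri::XML(my_xml)
# Look up xsi:schemaLocation by namespace so the result does not depend on the prefix.
locations = doc.root.attribute_with_ns('schemaLocation', XSI_NS)
# Pair each namespace URI with its schema location hint.
schemata_by_ns = Hash[locations.value.scan(/(\S+)\s+(\S+)/)]
schemata_by_ns.each do |ns, xsd_uri|
  # URI.open is needed on Ruby 3.0+, where Kernel#open no longer fetches URLs.
  xsd = Nokogiri::XML::Schema(URI.open(xsd_uri))
  xsd.validate(doc).each do |error|
    puts error.message
  end
end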
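
The schemaLocation parsing step can also be pulled into a small helper that is easy to check without Nokogiri. This is only a sketch: parse_schema_location is a hypothetical name, not part of Nokogiri, and it assumes the attribute value has already been read as a string. Unlike the scan one-liner, it raises when the value holds an odd number of tokens, which usually points to a malformed attribute.

```ruby
# Hypothetical helper (not part of Nokogiri): turn an xsi:schemaLocation
# value into a { namespace_uri => schema_location_hint } hash.
def parse_schema_location(value)
  tokens = value.to_s.split # splits on any run of whitespace, including newlines
  unless tokens.length.even?
    raise ArgumentError,
          "schemaLocation must hold namespace/location pairs, got #{tokens.length} tokens"
  end
  tokens.each_slice(2).to_h
end

# Sample value copied from the XML example in this answer.
hints = parse_schema_location(<<~VALUE)
  http://bar.com/foo http://www.bar.com/schemas/foo.xsd
  http://biz.biz/ http://biz.biz/xml/ns/bz.xsd
VALUE

hints.each { |ns, xsd_uri| puts "#{ns} -> #{xsd_uri}" }
```

The resulting hash can be iterated exactly like schemata_by_ns in the validation loop.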

Disclaimer: I have never attempted to validate a single XML document using multiple namespaced schemata with Nokogiri before. As such, I have no direct experience to guarantee that the above validation will work. The validation code is based solely on Nokogiri's schema validation documentation.
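
If validating against each schema separately turns out to report errors for elements that belong to the other namespaces, one alternative is to generate the kind of root schema described in the question from the namespace/location pairs. The sketch below only builds the wrapper XSD text. wrapper_schema is a hypothetical helper, and handing its result to Nokogiri::XML::Schema is an untested assumption, in the same spirit as the disclaimer above.

```ruby
# Hypothetical helper: build a wrapper XSD that imports one schema per
# namespace, given a { namespace_uri => schema_location } hash.
def wrapper_schema(schemata_by_ns)
  imports = schemata_by_ns.map do |ns, location|
    # String#encode(xml: :attr) escapes the value and wraps it in double quotes.
    "  <xs:import namespace=#{ns.encode(xml: :attr)} " \
      "schemaLocation=#{location.encode(xml: :attr)}/>"
  end
  <<~XSD
    <?xml version="1.0"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    #{imports.join("\n")}
    </xs:schema>
  XSD
end

# Namespace/location pairs copied from the XML example in this answer.
xsd_text = wrapper_schema({
  "http://bar.com/foo" => "http://www.bar.com/schemas/foo.xsd",
  "http://biz.biz/" => "http://biz.biz/xml/ns/bz.xsd"
})
puts xsd_text

# Assumption, not verified here: Nokogiri::XML::Schema(xsd_text).validate(doc)
# would then check the document against all imported schemas in one pass.
```

Recent Nokogiri releases are understood to block network access while loading schemas by default, so the imported locations may need to be local file paths, or the schema may need to be created with parse options that allow network access.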

answered Sep 22 '22 16:09 by Phrogz