
One xml namespace equals one and only one schema file?


...or Why do these files validate in Visual Studio 2010 but not with xmllint (see note 1)?

I'm currently working against a published xml schema where the original author's habit is to break down the schemas into several .xsd-files, but where some schema files have the same targetNamespace. Is this really "allowed"?

Example (extremely simplified):

File    targetNamespace    Contents
------------------------------------------------------------
b1.xsd  uri:tempuri.org:b  complex type "fooType"
b2.xsd  uri:tempuri.org:b  simple type "barType"
a.xsd   uri:tempuri.org:a  imports b1.xsd and b2.xsd
                           definition of root element "foo", that
                           extends "b:fooType" with an attribute
                           of "b:barType"

(Complete file contents below.)

Then I have an xml file, data.xml, with this content:

<?xml version="1.0"?>
<foo bar="1" xmlns="uri:tempuri.org:a" xmlns:xs="http://www.w3.org/2001/XMLSchema" />

For a long time, I have believed that all of this was correct, since Visual Studio apparently allows this schema style. However, today I decided to set up a command line utility for validating xml files, and I chose xmllint.

When I ran xmllint --schema a.xsd data.xml, I was presented with this warning:

a.xsd:4: element import: Schemas parser warning : Element '{http://www.w3.org/2001/XMLSchema}import': Skipping import of schema located at 'b2.xsd' for the namespace 'uri:tempuri.org:b', since this namespace was already imported with the schema located at 'b1.xsd'.

The fact that the import of b2.xsd was skipped obviously leads to this error:

a.xsd:9: element attribute: Schemas parser error : attribute decl. 'bar', attribute 'type': The QName value '{uri:tempuri.org:b}barType' does not resolve to a(n) simple type definition.

If xmllint is correct, there would be an error in the published specs I'm working against. Is there? And Visual Studio would be wrong. Is it?

I do realize the difference between xs:import and xs:include. Right now, I just don't see how xs:include could fix things, since:

  • b1.xsd and b2.xsd have the same targetNamespace
  • they both differ in targetNamespace from a.xsd
  • and they do not (need to) know about each other

Is this a flaw in the original schema specification? I'm beginning to think that the third bullet point is crucial. Should the fact that they don't know about each other have led to placing them in different namespaces to begin with?


b1.xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="uri:tempuri.org:b" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="fooType" />
</xs:schema>

b2.xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="uri:tempuri.org:b" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="barType">
    <xs:restriction base="xs:integer" />
  </xs:simpleType>
</xs:schema>

a.xsd:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="uri:tempuri.org:a" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:b="uri:tempuri.org:b">
  <xs:import namespace="uri:tempuri.org:b" schemaLocation="b1.xsd" />
  <xs:import namespace="uri:tempuri.org:b" schemaLocation="b2.xsd" />
  <xs:element name="foo">
    <xs:complexType>
      <xs:complexContent>
        <xs:extension base="b:fooType">
          <xs:attribute name="bar" type="b:barType" />
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
</xs:schema>

Notes:

1) I'm using the Windows port of libxml2/xmllint found at www.zlatkovic.com.

asked Feb 14 '11 22:02 by Christoffer Lette



1 Answer

The crux of the problem is what it means to have two different <import> elements that both refer to the same namespace.

It helps to clarify the meaning when you consider that the schemaLocation attribute of <import> is entirely optional. When you leave it out, you're just saying "I want to import schema of namespace XYZ into this schema". The schemaLocation is just a hint as to where to find the definition of that other schema.
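To illustrate the point about schemaLocation being optional, here is a minimal sketch of an import that names only the namespace. It reuses the namespaces from the question. Whether the b: reference actually resolves depends on how your processor locates components for that namespace (for example, schemas supplied separately or through a catalog), so treat it as an illustration rather than something guaranteed to validate in every tool:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="uri:tempuri.org:a"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:b="uri:tempuri.org:b">
  <!-- No schemaLocation: this only declares that components from
       namespace uri:tempuri.org:b may be referenced in this schema.
       Finding their definitions is left to the processor. -->
  <xs:import namespace="uri:tempuri.org:b" />
  <xs:element name="foo" type="b:fooType" />
</xs:schema>
```

Seen this way, <import> is a statement about a namespace, not about a file, which is why a second <import> for the same namespace adds nothing in the stricter reading.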

The precise meaning of <import> is a bit fuzzy when you read the W3C spec, possibly deliberately so. As a result, interpretations vary.

Some XML processors tolerate multiple <import> elements for the same namespace, and essentially amalgamate all of the schemaLocation values into a single target.

Other processors are stricter, and decide that only one <import> per target namespace is valid. I think this is more correct, when you consider that schemaLocation is optional.

In addition to the VS and xmllint examples you gave, Xerces-J is also super-strict, and ignores subsequent <import> elements for the same target namespace, giving much the same error as xmllint does. XML Spy, on the other hand, is much more permissive (but then, XML Spy's validation is notoriously flaky).

To be safe, you should not have these multiple imports. A given namespace should have a single "master" document, which in turn has an <include> for each sub-document. This master is often highly artificial, acting only as a container for these sub-documents.
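Applied to the files in the question, the master document described above might look like the sketch below. The file name b.xsd is hypothetical (it is not part of the published schema), and b1.xsd and b2.xsd stay unchanged:

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- b.xsd (hypothetical name): container for everything in namespace
     uri:tempuri.org:b. xs:include requires the included documents to have
     the same targetNamespace as this one (or none), which b1.xsd and
     b2.xsd both satisfy. -->
<xs:schema targetNamespace="uri:tempuri.org:b"
           xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:include schemaLocation="b1.xsd" />
  <xs:include schemaLocation="b2.xsd" />
</xs:schema>
```

a.xsd then needs only one import for the namespace, pointing at the master:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="uri:tempuri.org:a" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:b="uri:tempuri.org:b">
  <!-- Single import replaces the two imports of b1.xsd and b2.xsd -->
  <xs:import namespace="uri:tempuri.org:b" schemaLocation="b.xsd" />
  <xs:element name="foo">
    <xs:complexType>
      <xs:complexContent>
        <xs:extension base="b:fooType">
          <xs:attribute name="bar" type="b:barType" />
        </xs:extension>
      </xs:complexContent>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

With this layout b1.xsd and b2.xsd still don't need to know about each other; only the container knows about both. If you can't modify the published schema files, you could keep b.xsd and a patched copy of a.xsd locally, but that is a workaround on your side, not a change to the spec.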

From what I've seen, this is generally considered "best practice" for XML Schema when it comes to maximum tool compatibility, but some will argue that it's a hack that takes away from elegant schema design.

Meh.

answered Oct 12 '22 14:10 by skaffman