
XSD circular import

Tags: java, xsd, xsom

I need to parse an XSD with XSOM, but this XSD contains circular imports.

A.xsd

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="ns1" targetNamespace="ns1">
  <xs:import namespace="ns2" schemaLocation="B.xsd"/>
  <xs:element name="MyElement" type="xs:string"/>
</xs:schema>

B.xsd

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="ns2" targetNamespace="ns2" xmlns:ns1="ns1">
  <xs:import namespace="ns1" schemaLocation="A.xsd"/>
  <xs:complexType name="MyComplex">
    <xs:sequence>
      <xs:element ref="ns1:MyElement" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

XSOM can’t parse the schema because it detects elements that have already been defined due to circular imports. So I tried to break the circular import by externalizing the elements that are defined by A and used in B.

C.xsd contains the elements from A that are used by B. Note that these elements are not used in A itself. Don’t ask me why they were defined in A.

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="ns1" targetNamespace="ns1">
  <xs:element name="MyElement" type="xs:string"/>
</xs:schema>

A.xsd becomes

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="ns1" targetNamespace="ns1">
  <xs:import namespace="ns2" schemaLocation="B.xsd"/>
</xs:schema>

B.xsd (import C.xsd instead of A.xsd) becomes

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="ns2" targetNamespace="ns2" xmlns:ns1="ns1">
  <xs:import namespace="ns1" schemaLocation="C.xsd"/>
  <xs:complexType name="MyComplex">
    <xs:sequence>
      <xs:element ref="ns1:MyElement" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>

XSOM can parse the XSD. But now I can’t create the schema with the following code:

SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
sf.setResourceResolver(new MyResourceResolver());

I use the standard implementation bundled with JDK 1.7 and get this exception:

src-resolve: Cannot resolve the name 'ns1:MyElement' to a(n) 'element declaration' component.

The issue is that the resource resolver is called for B's namespace (ns2) but not for A's namespace (ns1), which makes sense. Since namespace ns1 is shared by A.xsd and C.xsd, C.xsd is never requested from the resolver, so the elements defined in it can’t be found.

Are circular imports valid? Is it possible to break a circular import so it can be parsed by XSOM and then loaded by the SchemaFactory?

asked Feb 12 '13 at 16:02 by Sydney

1 Answer

On the general question:

You ask "Are circular imports valid?" If by circularity you mean that there is a chain of schema documents S[1], S[2], ..., S[n] where schema document S[1] refers to schema document S[2] by name, S[2] to S[3], ..., S[n-1] to S[n], and S[n] to S[1], then I don't believe either the XSD 1.0 spec or the XSD 1.1 spec says clearly one way or the other. (Some WG members tried to persuade the WG to improve the clarity of its thinking on this and related topics, but failed.) Some implementations support circular import (and other forms of circularity), but I don't think it is possible to argue from the spec that your implementation is doing anything wrong.

If on the other hand you mean merely that there is a cycle such that for 1 <= i <= n-1, S[i] imports the namespace of S[i+1], and S[n] imports the namespace of S[1], then I believe that such cycles are clearly legal (and in some cases unavoidable).

The workaround I recommend is:

  1. In any schema document that declares anything, use xs:import as needed, but do not specify a schema location on the import. Cycles in such references are harmless.
  2. When invoking the schema processor, give it the full list of all the schema documents you want it to read, and if possible tell it via options or configuration not to read any other schema documents.
  3. If your schema processor does not accept multiple schema documents as input at validation time (so that you must have a single schema document that refers to everything you want to be read), or if you don't trust yourself to get the list of schema documents right at invocation time, then add a top-level driver document that does nothing but include and import the other schema documents you want to be read, with specific schema location information.
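
Steps 1 and 2 might look roughly like this with the JAXP API used in the question. This is only a sketch under stated assumptions: the class name MultiSourceSchemaSketch and the in-memory strings are illustrative, and the schemas are the question's A.xsd and B.xsd with the schemaLocation attributes removed. It also assumes that the processor resolves a location-less import against the documents it was handed earlier in the array, which is why A is listed before B; other processors may behave differently.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original question or answer.
final class MultiSourceSchemaSketch {
    private static final String XS = "http://www.w3.org/2001/XMLSchema";

    // A.xsd from the question, with schemaLocation dropped from the import.
    static final String A_XSD =
        "<xs:schema xmlns:xs='" + XS + "' xmlns='ns1' targetNamespace='ns1'>"
      + "<xs:import namespace='ns2'/>"
      + "<xs:element name='MyElement' type='xs:string'/>"
      + "</xs:schema>";

    // B.xsd from the question, with schemaLocation dropped from the import.
    static final String B_XSD =
        "<xs:schema xmlns:xs='" + XS + "' xmlns='ns2' targetNamespace='ns2' xmlns:ns1='ns1'>"
      + "<xs:import namespace='ns1'/>"
      + "<xs:complexType name='MyComplex'><xs:sequence>"
      + "<xs:element ref='ns1:MyElement' minOccurs='0'/>"
      + "</xs:sequence></xs:complexType>"
      + "</xs:schema>";

    // Hands the factory the full list of schema documents in one call.
    static Schema loadSchema() throws SAXException {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Source[] all = {
            new StreamSource(new StringReader(A_XSD), "A.xsd"), // declares ns1:MyElement
            new StreamSource(new StringReader(B_XSD), "B.xsd")  // refers to ns1:MyElement
        };
        return sf.newSchema(all);
    }

    // Returns true only if the schema loads and the instance validates against it.
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }
}
```

Calling MultiSourceSchemaSketch.isValid with a small ns1:MyElement instance is a quick way to check that both documents were actually read.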

In your case, that would mean deleting the xs:import/@schemaLocation attribute from (the original forms of) A.xsd and B.xsd, and adding a driver document of the form

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:import namespace="ns1" schemaLocation="A.xsd"/>
  <xs:import namespace="ns2" schemaLocation="B.xsd"/>
</xs:schema>

The effect is to ensure that there are never cycles in schema documents' references to other schema documents; that eliminates a very large class of cases where XSD implementations are inconsistent with each other (and in some cases with themselves -- sometimes the same processor produces dramatically different results on the same inputs when the invocation names the inputs in a different order).

On the specific question:

In your example, there is no requirement that ns2 be imported by either A.xsd or C.xsd, because neither of them includes any references to any components in namespace ns2. So the cycle in your example seems gratuitous.

In your second example, you give some code which does not succeed in loading the schema. But I don't see any reference in that code to any specific schema document at all; unless there is something relevant you're not showing us, it's no wonder the validator can't find a declaration for {ns1}MyElement.
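
For comparison, here is a sketch of what handing the factory an actual schema document could look like, using the driver document shown earlier together with a resource resolver. Everything named here is an assumption for illustration: DriverSchemaSketch, the DOCS map and the in-memory strings stand in for your MyResourceResolver and your real files, and A.xsd and B.xsd are the original documents with the schemaLocation attributes removed.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSResourceResolver;
import org.xml.sax.SAXException;

// Hypothetical helper class, not part of the original question or answer.
final class DriverSchemaSketch {
    private static final String XS = "http://www.w3.org/2001/XMLSchema";

    // Driver document: the only place that carries schema locations.
    static final String DRIVER =
        "<xs:schema xmlns:xs='" + XS + "'>"
      + "<xs:import namespace='ns1' schemaLocation='A.xsd'/>"
      + "<xs:import namespace='ns2' schemaLocation='B.xsd'/>"
      + "</xs:schema>";

    // In-memory stand-ins for the files, keyed by schema location.
    static final Map<String, String> DOCS = new HashMap<>();
    static {
        DOCS.put("A.xsd",
            "<xs:schema xmlns:xs='" + XS + "' xmlns='ns1' targetNamespace='ns1'>"
          + "<xs:import namespace='ns2'/>"
          + "<xs:element name='MyElement' type='xs:string'/>"
          + "</xs:schema>");
        DOCS.put("B.xsd",
            "<xs:schema xmlns:xs='" + XS + "' xmlns='ns2' targetNamespace='ns2' xmlns:ns1='ns1'>"
          + "<xs:import namespace='ns1'/>"
          + "<xs:complexType name='MyComplex'><xs:sequence>"
          + "<xs:element ref='ns1:MyElement' minOccurs='0'/>"
          + "</xs:sequence></xs:complexType>"
          + "</xs:schema>");
    }

    static Schema loadSchema() throws SAXException, ReflectiveOperationException {
        DOMImplementationLS ls = (DOMImplementationLS)
            DOMImplementationRegistry.newInstance().getDOMImplementation("LS");
        LSResourceResolver resolver = (type, namespaceURI, publicId, systemId, baseURI) -> {
            if (systemId == null) return null; // import without a location: nothing to fetch
            String text = DOCS.get(systemId.substring(systemId.lastIndexOf('/') + 1));
            if (text == null) return null;     // unknown document: fall back to default handling
            LSInput in = ls.createLSInput();
            in.setCharacterStream(new StringReader(text));
            in.setSystemId(systemId);
            in.setPublicId(publicId);
            in.setBaseURI(baseURI);
            return in;
        };
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        sf.setResourceResolver(resolver);
        // Unlike the code in the question, a schema document is actually passed in.
        return sf.newSchema(new StreamSource(new StringReader(DRIVER), "driver.xsd"));
    }

    // Returns true only if the schema loads and the instance validates against it.
    static boolean isValid(String xml) {
        try {
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException | ReflectiveOperationException e) {
            return false;
        }
    }
}
```

The resolver returns null for anything it does not recognise, so imports without a schema location simply fall through to the components the driver has already pulled in.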

answered Nov 15 '22 at 09:11 by C. M. Sperberg-McQueen