
XML Schema: Can I make some of an attribute's values be required but still allow other values?

Tags: xml, schema, xsd

(Note: I cannot change the structure of the XML I receive. I can only change how I validate it.)

Let's say I can get XML like this:

<Address Field="Street" Value="123 Main"/>
<Address Field="StreetPartTwo" Value="Unit B"/>
<Address Field="State" Value="CO"/>
<Address Field="Zip" Value="80020"/>
<Address Field="SomeOtherCrazyValue" Value="Foo"/>

I need to create an XSD schema that validates that "Street", "State" and "Zip" are all present. I don't care whether "StreetPartTwo" or "SomeOtherCrazyValue" happen to be present too.

If I knew that only the three I care about could be included (and that each would only be included once), I could do something like this:

<xs:element name="Address" type="addressType" maxOccurs="unbounded" minOccurs="3"/>

<xs:complexType name="addressType">
  <xs:attribute name="Field" use="required">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:enumeration value="Street"/>
        <xs:enumeration value="State"/>
        <xs:enumeration value="Zip"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:attribute>
</xs:complexType>

But this won't work with my case because I may also receive those other Address elements (that also have "Field" attributes) that I don't care about.

Any ideas how I can ensure the stuff I care about is present but let the other stuff in too?

TIA! Sean

asked May 05 '10 23:05 by scrotty




1 Answer

You cannot do the validation you seek with XML Schema alone.

According to the "XML Schema Part 1: Structures" specification ...

When two or more particles contained directly or indirectly in the {particles} of a model group have identically named element declarations as their {term}, the type definitions of those declarations must be the same.

That is not to say you cannot build a schema that accepts every correct document. It means you cannot build a schema that rejects every incorrect one, where "incorrect" means a document that violates the constraints you stated in English.

For example, suppose you have a document that includes three <Address> elements with Field="Street", like this:

<Address Field="Street" Value="123 Main"/> 
<Address Field="Street" Value="456 Main"/> 
<Address Field="Street" Value="789 Main"/> 
<Address Field="SomeOtherCrazyValue" Value="Foo"/> 

According to your schema, that document is a valid address. You can add an xs:unique constraint to your schema so that it rejects such broken documents. But even with xs:unique, the schema would still accept other incorrect documents: for example, a document with three <Address> elements, each with a distinct Field attribute, but none with Field="Zip".
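As a rough sketch of the xs:unique idea: the question does not show the parent of the <Address> elements, so the wrapper name Addresses below is made up, and treating Value as a required string attribute is likewise an assumption.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- "Addresses" is a hypothetical wrapper element; substitute the real parent. -->
  <xs:element name="Addresses">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Address" type="addressType"
                    minOccurs="3" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
    <!-- No two <Address> children may carry the same Field value. -->
    <xs:unique name="uniqueField">
      <xs:selector xpath="Address"/>
      <xs:field xpath="@Field"/>
    </xs:unique>
  </xs:element>

  <xs:complexType name="addressType">
    <!-- Field is an unrestricted string, so unknown values are allowed. -->
    <xs:attribute name="Field" type="xs:string" use="required"/>
    <!-- Assumption: Value is always present. -->
    <xs:attribute name="Value" type="xs:string" use="required"/>
  </xs:complexType>

</xs:schema>
```

This rejects the three-Street document above, but it still accepts any document with three or more distinct Field values, whether or not Street, State and Zip are among them.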

In fact it is not possible to produce a W3C XML Schema that formally codifies your stated constraints. The <xs:all> element almost gets you there, but it applies only to elements, not to attributes. It also cannot be used with an extension, so you can't say, in W3C XML Schema, "all these elements in any order, plus any other ones".


In order to perform the validation you seek, your options are:

  1. rely on something other than XML Schema,
  2. perform validation in multiple steps, using XML Schema for the first step, and something else for the second step.

For the first option, I think you could use RELAX NG to do it. The downside is that it is not a W3C standard (it is standardized by ISO/IEC instead) and, as far as I can tell, it is neither widely supported nor growing. It would be like learning Gaelic in order to express a thought. There's nothing wrong with Gaelic, but it's sort of a linguistic cul-de-sac, and I think RELAX NG is, too.

For the second option, an approach would be to validate against your schema as the first step, and then, as the second step:

A. apply an XSL transform which would convert <Address> elements into elements named for the value of their Field attribute. The output of that transform would look like this:

<root>
  <Street Value="101 Bellavista Drive"/>
  <State  Value="Confusion"/>
  <Zip    Value="10101"/>
</root>

B. validate the output of that transform against a different schema, which looks something like this:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           elementFormDefault="qualified">
  <xs:element name="root">
    <xs:complexType>
      <xs:all>
        <xs:element maxOccurs="1" minOccurs="1" ref="Street" />
        <xs:element maxOccurs="1" minOccurs="1" ref="State" />
        <xs:element maxOccurs="1" minOccurs="1" ref="Zip" />
      </xs:all>
    </xs:complexType>
  </xs:element>

  <xs:element name="Street">
    <xs:complexType>
      <xs:attribute name="Value" use="required" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="State">
    <xs:complexType>
      <xs:attribute name="Value" use="required" type="xs:string"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="Zip">
    <xs:complexType>
      <xs:attribute name="Value" use="required" type="xs:string"/>
    </xs:complexType>
  </xs:element>

</xs:schema>

You would need to extend that schema to handle other elements like <SomeOtherCrazyValue> in the output of the transform. Or you could structure the XSL transform so that it simply does not emit elements other than {State, Street, Zip}.
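The transform in step A is not spelled out above, so here is a minimal XSLT 1.0 sketch of what it might look like. It assumes the <Address> elements are direct children of the document element (whose name the question does not give), and it takes the second route just mentioned: only Street, State and Zip are emitted, so the output matches the schema in step B.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Assumption: <Address> elements are direct children of the
       document element, whatever that element is called. -->
  <xsl:template match="/*">
    <root>
      <!-- Only the three fields of interest are passed on. -->
      <xsl:apply-templates
          select="Address[@Field='Street' or @Field='State' or @Field='Zip']"/>
    </root>
  </xsl:template>

  <!-- Turn <Address Field="X" Value="Y"/> into <X Value="Y"/>. -->
  <xsl:template match="Address">
    <xsl:element name="{@Field}">
      <!-- Copies the Value attribute only if it is present. -->
      <xsl:copy-of select="@Value"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Filtering before renaming has a side benefit: xsl:element needs a legal XML name, and an arbitrary Field value (one containing a space, say) would not be one.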

Just to be clear, I understand that you cannot change the XML that you receive. This approach wouldn't require that. It just uses a funky 2-step validation approach. Once the 2nd validation step completes, you could discard the result of the transform.


EDIT - Actually, Sean, thinking about this again, you could skip the renaming in step A. Suppose your XSL transform only removes from the document those <Address> elements whose Field attribute is not State, Street or Zip. In other words, there would be no <Address Field="SomeOtherCrazyValue"...> left. The result of that transform could be validated with your schema, using maxOccurs="3", minOccurs="3", and an xs:unique.
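A filtering transform of that kind could be sketched in XSLT 1.0 as an identity transform plus one empty template. This is only an illustration; it assumes nothing about the parent element and copies everything except the unwanted <Address> elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy every node and attribute through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty template: drop any <Address> whose Field is not one of the
       three required values (or that has no Field attribute at all). -->
  <xsl:template
      match="Address[not(@Field='Street' or @Field='State' or @Field='Zip')]"/>

</xsl:stylesheet>
```

With exactly three <Address> elements allowed, Field restricted to the three enumerated values, and xs:unique on Field, the only documents that pass are those with one Street, one State and one Zip.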

answered Oct 09 '22 09:10 by Cheeso