How to document the structure of XML files

Tags:

xsd

xml-documentation

When it comes to documenting the structure of XML files...

One of my co-workers does it in a Word table.

Another pastes the elements into a Word document with comments like this:

<learningobject id="{Learning Object Id (same value as the loid tag)}"
                xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                xsi:noNamespaceSchemaLocation="http://www.aicpcu.org/schemas/cms_lo.xsd">
    <objectRoot>
        <v>
            <!-- Current version of the object from the repository. !-->
            <!-- (Occurrence: 1) -->
        </v>
        <label>
            <!-- Name of the object from the repository. !-->
            <!-- (Occurrence: 0 or 1 or Many) -->
        </label>
    </objectRoot>

Which one of these methods is preferred? Is there a better way?

Are there other options that do not require third party Schema Documenter tools to update?

238

asked Nov 17 '09 23:11

joe

2 Answers

I'd write an XML Schema (XSD) file to define the structure of the XML document. xs:annotation and xs:documentation tags can be included to describe the elements. The XSD file can be transformed into documentation using XSLT stylesheets such as xs3p or tools such as XML Schema Documenter.

For an introduction to XML Schema see the XML Schools tutorial.

Here is your example, expressed as XML Schema with xs:annotation tags:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="objectRoot">
    <xs:complexType>
      <xs:sequence>

        <xs:element name="v" type="xs:string">
          <xs:annotation>
            <xs:documentation>Current version of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>

        <xs:element name="label" minOccurs="0" maxOccurs="unbounded" type="xs:string">
          <xs:annotation>
            <xs:documentation>Name of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>

      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
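
If you want to avoid third-party documenter tools entirely, the annotations can also be pulled out with a few lines of script. The following is only a rough sketch using Python's standard library: the function name describe_elements and the embedded SAMPLE schema are made up for illustration, and it only looks at named xs:element declarations (no attributes, groups or types).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical sample: a trimmed schema in the style of the one above.
SAMPLE = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="objectRoot">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="v" type="xs:string">
          <xs:annotation>
            <xs:documentation>Current version of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="label" minOccurs="0" maxOccurs="unbounded" type="xs:string">
          <xs:annotation>
            <xs:documentation>Name of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def describe_elements(xsd_text):
    """Return (name, occurrence, documentation) for every named xs:element."""
    root = ET.fromstring(xsd_text)
    rows = []
    for el in root.iter(XS + "element"):
        name = el.get("name")
        if name is None:  # skip <xs:element ref="..."/> references
            continue
        # minOccurs and maxOccurs both default to 1 in XML Schema
        # (for top-level declarations the value is shown only for uniformity)
        lo = el.get("minOccurs", "1")
        hi = el.get("maxOccurs", "1")
        doc = el.find(XS + "annotation/" + XS + "documentation")
        text = " ".join(doc.text.split()) if doc is not None and doc.text else ""
        rows.append((name, f"{lo}..{hi}", text))
    return rows


for name, occurs, text in describe_elements(SAMPLE):
    print(f"{name:12} {occurs:14} {text}")
```

The output is a plain name / occurrence / description listing that can be pasted into whatever document your team already maintains, with the XSD staying the single source of truth.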

106

answered Sep 23 '22 06:09

Phil Ross

Enjoy RELAX NG compact syntax

Having experimented with various XML schema languages, I have found RELAX NG to be the best fit for most cases (reasoning at the end).

Requirements

Allow documenting XML document structure
Do it in readable form
Keep it simple for the author

Modified sample XML (doc.xml)

I have added one attribute to illustrate how this kind of structure is documented as well.

<objectRoot created="2015-05-06T20:46:56+02:00">
    <v>
        <!-- Current version of the object from the repository. !-->
        <!-- (Occurrence: 1) -->
    </v>
    <label>
        <!-- Name of the object from the repository. !-->
        <!-- (Occurrence: 0 or 1 or Many) -->
    </label>
</objectRoot>

Use RELAX NG Compact syntax with comments (schema.rnc)

RELAX NG lets you describe the sample XML structure as follows:

start =

## Container for one object
element objectRoot {

    ## datetime of object creation
    attribute created { xsd:dateTime },

    ## Current version of the object from the repository
    ## Occurrence 1 is assumed by default
    element v {
        text
    },

    ## Name of the object from the repository
    ## Note: the occurrence is denoted by the "*" and means 0 or more
    element label {
        text
    }*
}

I think it is very hard to beat this simplicity at the given level of expressiveness.

How to comment the structure

Always place the comment before the relevant element, not after it.
For readability, leave one blank line before the comment block.
Use the ## prefix, which is automatically translated into a documentation element in other schema formats. A single hash # is translated into an XML comment, not a documentation element.
Multiple consecutive ## comments (as in the example) become a single multi-line documentation string within one element.
An obvious fact: the inline XML comments in doc.xml are irrelevant; only what is in schema.rnc counts.
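
To make the ## versus # distinction concrete, here is a small illustration of the grouping rule described above. It is only a sketch under simplifying assumptions: collect_annotations is a made-up helper, it is not how trang parses the file, and it ignores details such as # characters inside string literals.

```python
def collect_annotations(rnc_text):
    """Group '##' lines into documentation strings and '#' lines into comments."""
    docs, comments, block = [], [], []
    for raw in rnc_text.splitlines():
        line = raw.strip()
        if line.startswith("##"):
            # consecutive '##' lines belong to one documentation string
            block.append(line[2:].strip())
            continue
        if block:
            # any other line closes the current documentation block
            docs.append("\n".join(block))
            block = []
        if line.startswith("#"):
            comments.append(line[1:].strip())
    if block:
        docs.append("\n".join(block))
    return docs, comments


# Hypothetical input in the style of schema.rnc
RNC = """\
# internal note for schema maintainers
start =

## Current version of the object from the repository
## Occurrence 1 is assumed by default
element v { text }
"""

docs, comments = collect_annotations(RNC)
print(docs)
print(comments)
```

The two ## lines end up as one two-line documentation string, while the single-hash line is kept apart as an ordinary comment.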

If XML Schema 1.0 is required, generate it (schema.xsd)

Assuming you have the open source tool trang available, you can create an XML Schema file as follows:

$ trang schema.rnc schema.xsd

The resulting schema looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="objectRoot">
    <xs:annotation>
      <xs:documentation>Container for one object</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="v"/>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="label"/>
      </xs:sequence>
      <xs:attribute name="created" use="required" type="xs:dateTime">
        <xs:annotation>
          <xs:documentation>datetime of object creation</xs:documentation>
        </xs:annotation>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
  <xs:element name="v" type="xs:string">
    <xs:annotation>
      <xs:documentation>Current version of the object from the repository
Occurrence 1 is assumed by default</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="label" type="xs:string">
    <xs:annotation>
      <xs:documentation>Name of the object from the repository
Note: the occurrence is denoted by the "*" and means 0 or more</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>

Now clients who insist on using only XML Schema 1.0 can use your XML document specification too.

Validating doc.xml against schema.rnc

There are open source tools such as jing and rnv that support RELAX NG Compact syntax and run on both Linux and MS Windows.

Note: these tools are rather old but very stable. Read their age as a sign of stability, not of obsolescence.

Using jing:

$ jing -c schema.rnc doc.xml

The -c option is important: by default jing assumes RELAX NG in its XML form.

Using rnv to check that schema.rnc itself is valid:

$ rnv -c schema.rnc

and to validate doc.xml:

$ rnv schema.rnc doc.xml

rnv allows validating multiple documents at once:

$ rnv schema.rnc doc.xml otherdoc.xml anotherone.xml

RELAX NG Compact syntax - pros

Very readable; even a newcomer should understand the text.
Easy to learn (RELAX NG comes with a good tutorial; one can learn most of it within a day).
Very flexible (although it looks simple, it covers many situations, some of which cannot be handled by XML Schema 1.0 at all).
Tools exist for converting it into other formats (RELAX NG XML form, XML Schema 1.0, DTD) and even for generating a sample XML document.

RELAX NG limitations

Multiplicity can only be "zero or one", "exactly one", "zero or more" or "one or more". (A small bounded multiplicity can be described by "stupid repetition" of "zero or one" definitions.)
Some XML Schema 1.0 constructs cannot be described in RELAX NG.
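
As an illustration of the "stupid repetition" workaround, the helper below spells out a bounded multiplicity such as "1 to 3" as required copies followed by optional ones. It is only a sketch: bounded is a made-up name, and the generated text is meant to be pasted into a .rnc file by hand.

```python
def bounded(pattern, lo, hi):
    """Expand 'pattern' into lo required copies followed by (hi - lo) optional ones."""
    if not 0 <= lo <= hi or hi == 0:
        raise ValueError("need 0 <= lo <= hi and hi >= 1")
    # required occurrences first, then one "zero or one" copy per remaining slot
    parts = [pattern] * lo + [pattern + "?"] * (hi - lo)
    return ",\n".join(parts)


# "between one and three label elements"
print(bounded("element label { text }", 1, 3))
```

For 1 to 3 labels this prints one plain element label { text } followed by two copies carrying the ? suffix, joined by commas. It stays readable for small bounds but clearly does not scale to something like "up to 50".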

Conclusions

For the requirements defined above, RELAX NG Compact syntax looks like the best fit. With RELAX NG you get both: a human-readable schema that is also usable for automated validation.

The existing limitations rarely come into play and can in many cases be worked around with comments or other means.

answered Sep 26 '22 06:09

Jan Vlcinsky
