When it comes to documenting the structure of XML files...
One of my co-workers does it in a Word table.
Another pastes the elements into a Word document with comments like this:
<learningobject id="{Learning Object Id (same value as the loid tag)}"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="http://www.aicpcu.org/schemas/cms_lo.xsd">
  <objectRoot>
    <v>
      <!-- Current version of the object from the repository. -->
      <!-- (Occurrence: 1) -->
    </v>
    <label>
      <!-- Name of the object from the repository. -->
      <!-- (Occurrence: 0 or 1 or Many) -->
    </label>
  </objectRoot>
Which one of these methods is preferred? Is there a better way?
Are there other options that do not require third party Schema Documenter tools to update?
XML data structures consist of elements, nested child elements, and attributes that Analytics identifies when it analyzes an XML file. They are displayed in the XML Data Structures treeview, which is a hierarchical representation of the XML file.
An XML document is a basic unit of XML information composed of elements and other markup in an orderly package. An XML document can contain a wide variety of data, for example a database of numbers, numbers representing a molecular structure, or a mathematical equation.
XML (Extensible Markup Language) is a markup language like HTML for storage or transmission of data. XML is widely used in web services to transport data over the network. XML has no predefined tags, unlike HTML. XML is very easy to parse and generate.
An XML document is called well-formed if it satisfies certain rules specified by the W3C. Among these rules: every start tag must have a corresponding end tag, and elements must be properly nested within each other.
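The two well-formedness rules mentioned above can be checked with nothing more than a standard XML parser. The following is a minimal sketch using Python's standard library; the helper name `is_well_formed` and the three sample strings are made up for illustration, and the check covers well-formedness only, not validity against a schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        # The parser rejects missing end tags and improper nesting alike.
        return False


# Illustrative samples, loosely based on the objectRoot example in the question.
good = "<objectRoot><v>1</v><label>demo</label></objectRoot>"
missing_end_tag = "<objectRoot><v>1</objectRoot>"
bad_nesting = "<objectRoot><v><label></v></label></objectRoot>"

for name, text in [
    ("good", good),
    ("missing_end_tag", missing_end_tag),
    ("bad_nesting", bad_nesting),
]:
    print(name, is_well_formed(text))
```

A document can pass this check and still violate the documented structure, which is why the answers below turn to schema languages.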
I'd write an XML Schema (XSD) file to define the structure of the XML document. xs:annotation and xs:documentation tags can be included to describe the elements. The XSD file can be transformed into documentation using XSLT stylesheets such as xs3p or tools such as XML Schema Documenter.
For an introduction to XML Schema see the XML Schools tutorial.
Here is your example, expressed as XML Schema with xs:annotation tags:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="objectRoot">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="v" type="xs:string">
          <xs:annotation>
            <xs:documentation>Current version of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="label" minOccurs="0" maxOccurs="unbounded" type="xs:string">
          <xs:annotation>
            <xs:documentation>Name of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
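If installing a schema documenter is not an option, the annotations can also be pulled out of the XSD with a few lines of script. The sketch below uses only Python's standard library; the function name `describe_elements` is hypothetical, the embedded schema is a copy of the one in this answer (with the root element spelled objectRoot, as in the question's sample), and the defaults of 1 for minOccurs and maxOccurs follow XML Schema's own defaults.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Copy of the schema from this answer, embedded to keep the sketch self-contained.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="objectRoot">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="v" type="xs:string">
          <xs:annotation>
            <xs:documentation>Current version of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
        <xs:element name="label" minOccurs="0" maxOccurs="unbounded" type="xs:string">
          <xs:annotation>
            <xs:documentation>Name of the object from the repository.</xs:documentation>
          </xs:annotation>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def describe_elements(xsd_text):
    """Return (name, minOccurs, maxOccurs, documentation) per named xs:element."""
    root = ET.fromstring(xsd_text)
    rows = []
    for element in root.iter(XS + "element"):
        name = element.get("name")
        if name is None:
            continue  # skip <xs:element ref="..."/> entries, which carry no name
        # Only the annotation that is a direct child of this element is used.
        doc = element.find(XS + "annotation/" + XS + "documentation")
        text = " ".join(doc.text.split()) if doc is not None and doc.text else ""
        rows.append(
            (name, element.get("minOccurs", "1"), element.get("maxOccurs", "1"), text)
        )
    return rows


for name, min_occurs, max_occurs, text in describe_elements(SCHEMA):
    print(f"{name:12} {min_occurs}..{max_occurs:10} {text}")
```

The printed rows could be pasted into a document or written out as a table, so the XSD stays the single source of truth.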
Having experimented with various XML schema languages, I have found RELAX NG to be the best fit in most cases (reasoning at the end).
I have added one attribute to the sample to show how attributes are documented as well.
<objectRoot created="2015-05-06T20:46:56+02:00">
  <v>
    <!-- Current version of the object from the repository. -->
    <!-- (Occurrence: 1) -->
  </v>
  <label>
    <!-- Name of the object from the repository. -->
    <!-- (Occurrence: 0 or 1 or Many) -->
  </label>
</objectRoot>
RELAX NG Compact syntax allows describing the sample XML structure in the following way:
start =

## Container for one object
element objectRoot {

    ## datetime of object creation
    attribute created { xsd:dateTime },

    ## Current version of the object from the repository
    ## Occurrence 1 is assumed by default
    element v {
        text
    },

    ## Name of the object from the repository
    ## Note: the occurrence is denoted by the "*" and means 0 or more
    element label {
        text
    }*
}
I think it is very hard to beat this simplicity at the given level of expressiveness.
Documentation comments use the ## prefix, which is automatically translated into a documentation element in other schema formats. A single hash # translates into an XML comment, not a documentation element. Multiple consecutive ## comments (as in the example) turn into a single multi-line documentation string within a single element.
An obvious fact: the inline XML comments in doc.xml are irrelevant; only what is in schema.rnc counts.
Assuming you have the open source tool trang available, you can create an XML Schema file as follows:
$ trang schema.rnc schema.xsd
The resulting schema looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="objectRoot">
    <xs:annotation>
      <xs:documentation>Container for one object</xs:documentation>
    </xs:annotation>
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="v"/>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="label"/>
      </xs:sequence>
      <xs:attribute name="created" use="required" type="xs:dateTime">
        <xs:annotation>
          <xs:documentation>datetime of object creation</xs:documentation>
        </xs:annotation>
      </xs:attribute>
    </xs:complexType>
  </xs:element>
  <xs:element name="v" type="xs:string">
    <xs:annotation>
      <xs:documentation>Current version of the object from the repository
Occurrence 1 is assumed by default</xs:documentation>
    </xs:annotation>
  </xs:element>
  <xs:element name="label" type="xs:string">
    <xs:annotation>
      <xs:documentation>Name of the object from the repository
Note: the occurrence is denoted by the "*" and means 0 or more</xs:documentation>
    </xs:annotation>
  </xs:element>
</xs:schema>
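Note that the generated schema declares v and label at the top level and points to them with ref, so the occurrence information sits on the reference rather than on the declaration. The following sketch, using only Python's standard library, shows one way to read it back; the function names `occurrence` and `child_occurrences` are hypothetical, and the embedded schema is an abbreviated copy of the output above with the annotations and the attribute left out.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Abbreviated copy of the generated schema (annotations and attribute omitted).
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
  <xs:element name="objectRoot">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="v"/>
        <xs:element minOccurs="0" maxOccurs="unbounded" ref="label"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="v" type="xs:string"/>
  <xs:element name="label" type="xs:string"/>
</xs:schema>
"""


def occurrence(min_occurs, max_occurs):
    """Turn minOccurs/maxOccurs values into a short human-readable phrase."""
    if min_occurs == max_occurs:
        return f"exactly {min_occurs}"
    if max_occurs == "unbounded":
        return f"{min_occurs} or more"
    return f"{min_occurs} to {max_occurs}"


def child_occurrences(xsd_text, parent_name):
    """Map each child of the top-level element parent_name to its occurrence."""
    root = ET.fromstring(xsd_text)
    for top in root.findall(XS + "element"):
        if top.get("name") != parent_name:
            continue
        result = {}
        for child in top.iter(XS + "element"):
            if child is top:
                continue  # iter() also yields the parent itself
            key = child.get("ref") or child.get("name")
            # XML Schema defaults both minOccurs and maxOccurs to 1.
            result[key] = occurrence(
                child.get("minOccurs", "1"), child.get("maxOccurs", "1")
            )
        return result
    raise KeyError(parent_name)


print(child_occurrences(SCHEMA, "objectRoot"))
```

This sketch assumes a schema without a target namespace; with namespaces, the ref values would be prefixed names and would need resolving.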
Now clients who insist on using only XML Schema 1.0 can use your XML document specification as well.
Open source tools such as jing and rnv support RELAX NG Compact syntax and run on both Linux and MS Windows.
Note: these tools are rather old but very stable. Read their age as a sign of maturity, not of obsolescence.
Using jing:
$ jing -c schema.rnc doc.xml
The -c option is important, because jing by default assumes RELAX NG in XML form.
Using rnv to check that schema.rnc itself is valid:
$ rnv -c schema.rnc
and to validate doc.xml:
$ rnv schema.rnc doc.xml
rnv allows validating multiple documents at once:
$ rnv schema.rnc doc.xml otherdoc.xml anotherone.xml
For the requirement defined above, RELAX NG Compact syntax looks like the best fit. With RELAX NG you get both: a schema that is human-readable and one that is usable for automated validation.
The existing limitations rarely come into play and can in many cases be resolved by comments or other means.