I've seen several questions regarding problems with generating classes from XML Schema using xsd.exe, along with suggestions for how to pre-process the schema (often using XSLT) to resolve some of the trickier aspects prior to generation. My question is whether it's possible to construct a C# code generator that is 100% compliant with XML Schema. Are the problems with xsd.exe merely a question of its implementation, or do they point to a fundamental inconsistency between XML Schema and C#?
In particular, I'm interested in how to map concepts in XML Schema to C# - what are the accepted mappings, which mappings are debatable, are there XML Schema constructs that are inherently un-mappable and are there C# constructs that are underutilised? Is there a compliance specification that would provide rules for mapping, such that it could be implemented and tested?
EDIT: For the sake of clarity I'm fully aware that XML Schema won't provide me with fully implemented C# interfaces, I'm interested in whether it can be fully mapped to a C# class hierarchy.
EDIT 2: I've added a small bounty, as I'm interested in getting a bit more detail.
EDIT 3: Bounty still open, but so far heading toward stakx - a good answer but mainly dealing with how to replicate C# structures in XML Schema, rather than the other way round. Good input though.
Interesting question. Not too long ago, I was wondering about exactly the same thing.
I will show a couple of examples of how far I got. My demonstration will not be complete (considering that the XML Schema specification is fairly comprehensive), but it should suffice to show that you can go further than xsd.exe does (if you're willing to adhere to certain patterns when you write your XML Schema).
C# interfaces can be defined in XML Schema with complex types. For example:
<xsd:complexType name="IFoo" abstract="true">
  <xsd:attribute name="Bar" type="xsd:string" use="required" />
  <xsd:attribute name="Baz" type="xsd:int" use="optional" />
</xsd:complexType>
corresponds fairly well to:
interface IFoo
{
    string Bar { get; set; }
    int? Baz { get; set; }
}
The pattern here is that abstract and named (non-anonymous) complex types are basically the XML Schema equivalent of interfaces in C#.
Note some problems with the mapping:
C# access modifiers such as public, internal etc. cannot be rendered in XML Schema.
You have no way of expressing the difference between a C# field and a property in XML Schema.
You cannot define methods in XML Schema.
You also have no way of expressing the difference between a C# struct and class. (There are simple types in XML Schema, which roughly correspond to .NET value types; but they're much more restricted in XML Schema than complex types.)
use="optional" can be used to map nullable types. In XML Schema, you could define a string attribute as optional. Crossing over to C#, some loss in translation occurs: since string is a reference type, it cannot be declared as nullable (it's already nullable by default).
XML Schema also allows use="prohibited". This is again something that cannot be expressed in C#, or at least not in a nice fashion (AFAIK).
From my experiments, it appears that xsd.exe will never generate C# interfaces from abstract complex types; it will stay with abstract classes instead. (I'm guessing that this is to keep the translation logic reasonably simple.)
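As a rough illustration of the abstract-class outcome, here is a hand-written sketch of the kind of class a generator could emit for the IFoo complex type. It is not verbatim xsd.exe output. The concrete Foo class and the Program wrapper are hypothetical additions so the sketch can be run. It uses the XmlSerializer convention of a companion BazSpecified flag for the optional attribute, because, as far as I know, XmlSerializer does not accept int? on an attribute.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written approximation of generated code for the abstract complex type IFoo.
public abstract class IFoo
{
    // use="required" string attribute.
    [XmlAttribute]
    public string Bar { get; set; }

    // use="optional" int attribute: a plain int plus a companion flag.
    [XmlAttribute]
    public int Baz { get; set; }

    // XmlSerializer only writes Baz when this flag is true.
    [XmlIgnore]
    public bool BazSpecified { get; set; }
}

// Hypothetical concrete type, added only so that something can be serialized.
public class Foo : IFoo { }

public static class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(Foo));
        var writer = new StringWriter();

        // Baz is left unspecified, so the Baz attribute should be omitted.
        serializer.Serialize(writer, new Foo { Bar = "hello" });
        Console.WriteLine(writer.ToString());
    }
}
```

Note that nothing in this class enforces that Bar is present; the "required" constraint lives only in the schema.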
Abstract classes can be done very similarly to interfaces:
<xsd:element name="FooBase" abstract="true">
  <xsd:complexType>
    ...
  </xsd:complexType>
</xsd:element>
Here, you define an element with the abstract attribute set to true, and embed an anonymous complex type inside it.
This corresponds to the following type declaration in C#:
abstract class FooBase { ... }
A non-abstract class is defined as above, but without abstract="true".
A class that implements an interface can be declared like this:
<xsd:complexType name="IFoo" abstract="true">
  ...
</xsd:complexType>
<xsd:element name="Foo" type="IFoo" />
This maps to:
interface IFoo { ... }
class Foo : IFoo { ... }
That is, you define both a named, abstract complex type (the interface), and a named element with that type.
Note that the C# code snippet above contains "..." twice, while the XML Schema snippet has it only once. How come?
Because you cannot define methods (code), and because you also cannot specify access modifiers, you don't need to "implement" a complex type with the element in XML Schema. The "implementation" of the complex type would be identical to the original declaration. If the complex type defines some attributes, these simply get mapped to auto-properties in a C# interface implementation.
Class and interface inheritance in XML Schema can be defined through a combination of type extensions and element substitution groups:
<xsd:element name="Base" type="base" />
<xsd:element name="Derived" substitutionGroup="Base" type="derived" />
<!--                        ^^^^^^^^^^^^^^^^^^^^^^^^ -->
<xsd:complexType name="base">
  <xsd:attribute name="Foo" type="xsd:boolean" use="required" />
</xsd:complexType>
<xsd:complexType name="derived">
  <xsd:complexContent>
    <xsd:extension base="base"> <!-- !!! -->
      <xsd:attribute name="Bar" type="xsd:string" use="required" />
    </xsd:extension>
  </xsd:complexContent>
</xsd:complexType>
This maps to:
class Base
{
    bool Foo { get; set; }
}
class Derived : Base
{
    string Bar { get; set; }
}
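To make the link between the two snippets a little more concrete, here is a minimal sketch of how the Base and Derived classes could be annotated for XmlSerializer so that the element and type names line up with the schema. The attribute choices and the Program wrapper are my assumptions, not something a particular generator is known to produce. The substitution group itself has no direct counterpart here; [XmlInclude] only tells the serializer that a Derived may appear where a Base is expected.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Complex type "base", used by the element <Base>.
[XmlType("base")]
[XmlRoot("Base")]
[XmlInclude(typeof(Derived))]
public class Base
{
    [XmlAttribute]
    public bool Foo { get; set; }
}

// Complex type "derived" extends "base"; used by the element <Derived>.
[XmlType("derived")]
[XmlRoot("Derived")]
public class Derived : Base
{
    [XmlAttribute]
    public string Bar { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(Derived));
        var writer = new StringWriter();

        // Writes a <Derived> element carrying both the inherited Foo
        // attribute and the Bar attribute added by the extension.
        serializer.Serialize(writer, new Derived { Foo = true, Bar = "x" });
        Console.WriteLine(writer.ToString());
    }
}
```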
Note:
We're again using named complex types. But this time, they're not defined with abstract="true", since we're not defining any C# interface type.
Note the references: element Derived is in Base's substitution group; at the same time, complex type derived is an extension of complex type base. Derived has type derived, Base has type base.
Named complex types that are not abstract have no direct counterpart in C#. They're not classes, since they cannot be instantiated (in XML, elements, not types, have roughly the same function as value constructors in F# or object instantiation in C#); neither are they truly interfaces, since they are not declared abstract.
What I haven't shown:
How one would declare, in XML Schema, a C# class type that implements several interfaces.
How complex content in XML Schema maps to C# (my first guess is that there's no correspondence in C# at all; at least not in the general case).
enums. (They are realised in XML Schema by restricting a simple type via enumeration, btw.)
const fields in a class (these would possibly map to attributes with a fixed value).
How to map xsd:choice and xsd:sequence to C#.
How to correctly map IEnumerable<T>, ICollection<T>, IList<T>, IDictionary<TKey, TValue> to XML Schema.
XML Schema simple types, which sound like they're the corresponding concept of .NET value types, but are far more restricted and have a different purpose.
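To make one of these items concrete, the enumeration case can be sketched as follows. The Color type, its values, and the Swatch class are hypothetical, chosen only to show that schema enumeration values need not be valid C# identifiers, which is why an explicit [XmlEnum] name is used.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical schema fragment this enum is meant to mirror:
//   <xsd:simpleType name="Color">
//     <xsd:restriction base="xsd:string">
//       <xsd:enumeration value="red" />
//       <xsd:enumeration value="dark-blue" />
//     </xsd:restriction>
//   </xsd:simpleType>
public enum Color
{
    [XmlEnum("red")] Red,

    // "dark-blue" is not a legal C# identifier, so the XML name is given explicitly.
    [XmlEnum("dark-blue")] DarkBlue
}

// Hypothetical type with an attribute of the enumerated simple type.
public class Swatch
{
    [XmlAttribute("color")]
    public Color Color { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(Swatch));
        var writer = new StringWriter();

        // The color attribute is written using the XML name, not the C# member name.
        serializer.Serialize(writer, new Swatch { Color = Color.DarkBlue });
        Console.WriteLine(writer.ToString());
    }
}
```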
There are many, many more things that I haven't shown, but by now you can probably see the basic patterns behind my examples.
To do all this correctly, one would have to systematically go through the XML Schema specification and see how each concept mentioned there maps best to C#. (There's perhaps no single best solution, but several alternatives.) But I explicitly meant to show only a couple of interesting examples. I hope that was still informative enough!
It's not a limit to code generation. It's that XML schema does not describe classes. It describes XML, which is a different thing.
The result is that there is an "impedance mismatch" between XML Schema and C# classes, or Java classes, or any other kind of classes. The two are not equivalent, and are not meant to be.
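One place where this mismatch becomes visible is xsd:choice. C# has no built-in union type, so a common workaround with XmlSerializer is a single member typed as object, with one [XmlElement] attribute per alternative. The sketch below assumes a hypothetical Payment type whose content is a choice between a Card element (an int) and an Iban element (a string); none of these names come from the question.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public class Payment
{
    // Exactly one of <Card> or <Iban> is expected. The static type is object,
    // so the compiler cannot enforce the choice; the runtime type of the value
    // selects which element is written.
    [XmlElement("Card", typeof(int))]
    [XmlElement("Iban", typeof(string))]
    public object Item { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var serializer = new XmlSerializer(typeof(Payment));
        var writer = new StringWriter();

        // A string value is written as the <Iban> alternative.
        serializer.Serialize(writer, new Payment { Item = "DE00" });
        Console.WriteLine(writer.ToString());

        // Reading the document back maps the element name to the runtime type.
        var roundTripped = (Payment)serializer.Deserialize(new StringReader(writer.ToString()));
        Console.WriteLine(roundTripped.Item.GetType());
    }
}
```

The schema says "one of these two"; the C# type says only "some object", and the constraint is recovered at run time rather than by the type system.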