My current task involves writing a class library for processing HL7 CDA files.
These HL7 CDA files are XML files with a defined XML schema, so I used xsd.exe to generate .NET classes for XML serialization and deserialization.
The XML Schema contains various types which contain the mixed="true" attribute, specifying that an XML node of this type may contain normal text mixed with other XML nodes.
The relevant part of the XML schema for one of these types looks like this:
<xs:complexType name="StrucDoc.Paragraph" mixed="true">
    <xs:sequence>
        <xs:element name="caption" type="StrucDoc.Caption" minOccurs="0"/>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="br" type="StrucDoc.Br"/>
            <xs:element name="sub" type="StrucDoc.Sub"/>
            <xs:element name="sup" type="StrucDoc.Sup"/>
            <!-- ...other possible nodes... -->
        </xs:choice>
    </xs:sequence>
    <xs:attribute name="ID" type="xs:ID"/>
    <!-- ...other attributes... -->
</xs:complexType>
The generated code for this type looks like this:
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(TypeName="StrucDoc.Paragraph", Namespace="urn:hl7-org:v3")]
public partial class StrucDocParagraph {
    private StrucDocCaption captionField;
    private object[] itemsField;
    private string[] textField;
    private string idField;
    // ...fields for other attributes...
    /// <remarks/>
    public StrucDocCaption caption {
        get {
            return this.captionField;
        }
        set {
            this.captionField = value;
        }
    }
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("br", typeof(StrucDocBr))]
    [System.Xml.Serialization.XmlElementAttribute("sub", typeof(StrucDocSub))]
    [System.Xml.Serialization.XmlElementAttribute("sup", typeof(StrucDocSup))]
    // ...other possible nodes...
    public object[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }
    /// <remarks/>
    [System.Xml.Serialization.XmlTextAttribute()]
    public string[] Text {
        get {
            return this.textField;
        }
        set {
            this.textField = value;
        }
    }
    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")]
    public string ID {
        get {
            return this.idField;
        }
        set {
            this.idField = value;
        }
    }
    // ...properties for other attributes...
}
If I deserialize an XML element where the paragraph node looks like this:
<paragraph>first line<br /><br />third line</paragraph>
The result is that the item and text arrays are read like this:
itemsField = new object[]
{
    new StrucDocBr(),
    new StrucDocBr(),
};
textField = new string[]
{
    "first line",
    "third line",
};
From these two separate arrays there is no way to determine the original order of the text and the other nodes.
If I serialize this again, the result looks exactly like this:
<paragraph>
<br />
<br />first linethird line
</paragraph>
The default serializer just serializes the items first and then the text.
I tried implementing IXmlSerializable on the StrucDocParagraph class so that I could control the deserialization and serialization of the content, but it is rather complex because so many classes are involved, and I have not reached a solution yet because I don't know whether the effort would pay off.
Is there an easy workaround for this problem, or is it even possible with custom serialization via IXmlSerializable?
Or should I just use XmlDocument or XmlReader/XmlWriter to process these documents?
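For comparison, here is a minimal sketch of the last option, using LINQ to XML rather than XmlDocument. The class and method names (MixedContentReader, Describe) are made up for illustration; the point is only that XElement.Nodes() returns text and child elements in document order.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

// Hypothetical helper, not part of the generated CDA classes.
public static class MixedContentReader
{
    // Lists the child nodes of a mixed-content element in document order.
    public static List<string> Describe(string xml)
    {
        var parts = new List<string>();
        foreach (XNode node in XElement.Parse(xml).Nodes())
        {
            if (node is XText text)
            {
                parts.Add("text:" + text.Value);
            }
            else if (node is XElement element)
            {
                parts.Add("element:" + element.Name.LocalName);
            }
        }
        return parts;
    }
}
```

Calling MixedContentReader.Describe on the paragraph sample above should yield the text and br entries interleaved as they appear in the document. The trade-off is that this works on untyped nodes rather than on the classes generated by xsd.exe.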
To solve this problem I had to modify the generated classes:
1. Move the XmlTextAttribute from the Text property to the Items property and add the parameter Type = typeof(string)
2. Remove the Text property
3. Remove the textField field
As a result the generated code (modified) looks like this:
/// <remarks/>
[System.CodeDom.Compiler.GeneratedCodeAttribute("xsd", "2.0.50727.3038")]
[System.SerializableAttribute()]
[System.Diagnostics.DebuggerStepThroughAttribute()]
[System.ComponentModel.DesignerCategoryAttribute("code")]
[System.Xml.Serialization.XmlTypeAttribute(TypeName="StrucDoc.Paragraph", Namespace="urn:hl7-org:v3")]
public partial class StrucDocParagraph {
    private StrucDocCaption captionField;
    private object[] itemsField;
    private string idField;
    // ...fields for other attributes...
    /// <remarks/>
    public StrucDocCaption caption {
        get {
            return this.captionField;
        }
        set {
            this.captionField = value;
        }
    }
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("br", typeof(StrucDocBr))]
    [System.Xml.Serialization.XmlElementAttribute("sub", typeof(StrucDocSub))]
    [System.Xml.Serialization.XmlElementAttribute("sup", typeof(StrucDocSup))]
    // ...other possible nodes...
    [System.Xml.Serialization.XmlTextAttribute(typeof(string))]
    public object[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }
    /// <remarks/>
    [System.Xml.Serialization.XmlAttributeAttribute(DataType="ID")]
    public string ID {
        get {
            return this.idField;
        }
        set {
            this.idField = value;
        }
    }
    // ...properties for other attributes...
}
Now if I deserialize an XML element where the paragraph node looks like this:
<paragraph>first line<br /><br />third line</paragraph>
The result is that the item array is read like this:
itemsField = new object[]
{
    "first line",
    new StrucDocBr(),
    new StrucDocBr(),
    "third line",
};
This is exactly what I need: the order of the items and their content is correct.
And if I serialize this again, the result is again correct:
<paragraph>first line<br /><br />third line</paragraph>
What pointed me in the right direction was the answer by Guillaume; I had also thought that it must be possible this way. Then I found this in the MSDN documentation for XmlTextAttribute:
You can apply the XmlTextAttribute to a field or property that returns an array of strings. You can also apply the attribute to an array of type Object but you must set the Type property to string. In that case, any strings inserted into the array are serialized as XML text.
So serialization and deserialization work correctly now, but I don't know whether there are any other side effects. It may no longer be possible to generate a schema from these classes with xsd.exe, but I don't need that anyway.
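To try the attribute combination in isolation, here is a minimal sketch with stripped-down stand-in types. Br, Paragraph and MixedContentDemo are hypothetical names, not the real generated CDA classes, and the namespace handling of the real classes is left out.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical stand-in for StrucDocBr.
public class Br { }

// Hypothetical stand-in for StrucDocParagraph, reduced to the Items property.
[XmlRoot("paragraph")]
public class Paragraph
{
    // Elements and text share one array, so their relative order is kept.
    [XmlElement("br", typeof(Br))]
    [XmlText(typeof(string))]
    public object[] Items { get; set; }
}

public static class MixedContentDemo
{
    // Deserializes a paragraph element from an XML string.
    public static Paragraph Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(Paragraph));
        using (var reader = new StringReader(xml))
        {
            return (Paragraph)serializer.Deserialize(reader);
        }
    }

    // Serializes a paragraph back to an XML string without an XML declaration.
    public static string Write(Paragraph paragraph)
    {
        var serializer = new XmlSerializer(typeof(Paragraph));
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        var namespaces = new XmlSerializerNamespaces();
        namespaces.Add("", ""); // suppress the default xsi/xsd declarations
        var output = new StringBuilder();
        using (var writer = XmlWriter.Create(output, settings))
        {
            serializer.Serialize(writer, paragraph, namespaces);
        }
        return output.ToString();
    }
}
```

Passing the paragraph sample from above to MixedContentDemo.Read and the result to MixedContentDemo.Write is a quick way to check that strings and elements stay interleaved in both directions.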
I had the same problem as this, and came across this solution of altering the .cs generated by xsd.exe. Although it did work, I wasn't comfortable with altering the generated code, as I would need to remember to do it any time I regenerated the classes. It also led to some awkward code which had to test for and cast to XmlNode[] for the mailto elements.
My solution was to rethink the xsd. I ditched the use of the mixed type, and essentially defined my own mixed type.
I had this
XML: <text>some text <mailto>[email protected]</mailto>some more text</text>
<xs:complexType name="text" mixed="true">
    <xs:sequence>
        <xs:element minOccurs="0" maxOccurs="unbounded" name="mailto" type="xs:string" />
    </xs:sequence>
</xs:complexType>
and changed to
XML: <mytext><text>some text </text><mailto>[email protected]</mailto><text>some more text</text></mytext>
<xs:complexType name="mytext">
    <xs:sequence>
        <xs:choice minOccurs="0" maxOccurs="unbounded">
            <xs:element name="text">
                <xs:complexType>
                    <xs:simpleContent>
                        <xs:extension base="xs:string" />
                    </xs:simpleContent>
                </xs:complexType>
            </xs:element>
            <xs:element name="mailto">
                <xs:complexType>
                    <xs:simpleContent>
                        <xs:extension base="xs:string" />
                    </xs:simpleContent>
                </xs:complexType>
            </xs:element>
        </xs:choice>
    </xs:sequence>
</xs:complexType>
My generated code now gives me a class myText:
public partial class myText{
    private object[] itemsField;
    /// <remarks/>
    [System.Xml.Serialization.XmlElementAttribute("mailto", typeof(myTextTextMailto))]
    [System.Xml.Serialization.XmlElementAttribute("text", typeof(myTextText))]
    public object[] Items {
        get {
            return this.itemsField;
        }
        set {
            this.itemsField = value;
        }
    }
}
The order of the elements is now preserved during serialization and deserialization, but I do have to test for, cast to, and program against the types myTextTextMailto and myTextText.
Just thought I'd throw that in as an alternative approach which worked for me.
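The type testing mentioned above could look roughly like the sketch below. MyTextText, MyTextMailto and MyTextRenderer are hypothetical stand-ins; I am assuming the real generated classes expose their text through a Value property, which you should check against your own xsd.exe output.

```csharp
using System;
using System.Text;

// Hypothetical stand-ins for the generated element classes (assumed shape).
public class MyTextText { public string Value { get; set; } }
public class MyTextMailto { public string Value { get; set; } }

public static class MyTextRenderer
{
    // Walks the Items array in order and flattens it to a single string,
    // wrapping mail addresses in angle brackets.
    public static string Render(object[] items)
    {
        var result = new StringBuilder();
        foreach (var item in items ?? new object[0])
        {
            switch (item)
            {
                case MyTextText text:
                    result.Append(text.Value);
                    break;
                case MyTextMailto mailto:
                    result.Append("<").Append(mailto.Value).Append(">");
                    break;
                default:
                    // Also reached for null entries.
                    throw new InvalidOperationException(
                        "Unexpected item type: " + item?.GetType());
            }
        }
        return result.ToString();
    }
}
```

A type switch like this keeps the per-type handling in one place, which makes the casts a little less awkward than scattered is/as checks.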