I use the following XML Schema:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.test.com/XmlValidation"
elementFormDefault="qualified"
attributeFormDefault="unqualified"
xmlns:m="http://www.test.com/XmlValidation">
<xs:element name="test">
<xs:complexType>
<xs:sequence>
<xs:element name="testElement" type="m:requiredStringType"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:simpleType name="requiredStringType">
<xs:restriction base="xs:string">
<xs:minLength value="1"/>
<xs:whiteSpace value="collapse"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
It defines a requiredStringType that must be at least one character long and sets the whiteSpace facet to collapse.
When I validate the following XML document, the validation succeeds:
<?xml version="1.0" encoding="UTF-8"?>
<test xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.text.com/XmlValidation">
<testElement> </testElement>
</test>
w3.org defines for whitespace collapse:
"After the processing implied by replace, contiguous sequences of #x20's are collapsed to a single #x20, and leading and trailing #x20's are removed."
Does this mean that three whitespace characters are collapsed to one space or to none? In XMLSpy the validation fails; in .NET it succeeds.
In XML documents, there are two types of whitespace. Significant whitespace is part of the document content and should be preserved. Insignificant whitespace is added when editing XML documents for readability and is typically not intended to be part of the delivered document.
According to the XML standard, whitespace consists of space characters (U+0020), carriage returns (U+000D), line feeds (U+000A), and tabs (U+0009).
Since the definition says that leading and trailing spaces are removed, a string that contains only whitespace collapses to an empty string, which then fails the minLength of 1. XMLSpy is validating accurately and .NET is being generous (or is making an error).
This follows from the section White Space Normalization during Validation in XML Schema Part 1: Structures Second Edition:
preserve
No normalization is done, the value is the ·normalized value·
replace
All occurrences of #x9 (tab), #xA (line feed) and #xD (carriage return) are replaced with #x20 (space).
collapse
Subsequent to the replacements specified above under replace, contiguous sequences of #x20s are collapsed to a single #x20, and initial and/or final #x20s are deleted.
Thus, first all whitespace is replaced by blank characters, second contiguous sequences are replaced with a single blank character, third and last, initial and final blanks are deleted. Following this sequence, a string containing only whitespace must be normalized to an empty string during validation.
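The sequence above can be sketched in a few lines of Python. This is only an illustration of the normalization rules as quoted, not how XMLSpy or .NET implement them; the function names normalize_whitespace and satisfies_min_length are made up for this example.

```python
import re


def normalize_whitespace(value: str, mode: str = "collapse") -> str:
    """Apply an XML Schema whiteSpace mode to a raw string (illustrative only)."""
    if mode == "preserve":
        # preserve: no normalization is done
        return value
    # replace: each tab, line feed and carriage return becomes a space (#x20)
    replaced = re.sub(r"[\t\n\r]", " ", value)
    if mode == "replace":
        return replaced
    if mode == "collapse":
        # collapse: squeeze runs of spaces into one, then delete
        # initial and final spaces
        return re.sub(r" +", " ", replaced).strip(" ")
    raise ValueError(f"unknown whiteSpace mode: {mode!r}")


def satisfies_min_length(value: str, min_length: int = 1) -> bool:
    """Check minLength against the collapsed value, mirroring requiredStringType."""
    return len(normalize_whitespace(value, "collapse")) >= min_length
```

Under these rules, satisfies_min_length(" ") is False: the single space is first collapsed to an empty string, and an empty string is shorter than the required minimum of one character.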