I have a simple XML document:
<data>
<node1>value1</node1>
<node2>value2</node2>
</data>
I'm using IXmlSerializable to read and write such XML with DTOs. The following code works just fine:
XmlReader reader;
...
while( reader.Read() ){
    Console.Write( reader.ReadElementContentAsString() );
}
// outputs value1value2
However, if the whitespace in the XML is removed, i.e.
<data>
<node1>value1</node1><node2>value2</node2>
</data>
or I set XmlReaderSettings.IgnoreWhitespace = true, the code outputs only "value1" and ignores the second node. When I print the nodes that the parser traverses, I can see that ReadElementContentAsString moves the pointer to the EndElement of node2, but I don't understand why that happens or how to fix it.
Is it a possible XML parser implementation bug?
===============================================
Here is some sample code and two XML samples that produce different results:
string homedir = Path.GetDirectoryName(Application.ExecutablePath);
string xml = Path.Combine( homedir, "settings.xml" );
FileStream stream = new FileStream( xml, FileMode.Open );
XmlReaderSettings readerSettings = new XmlReaderSettings();
readerSettings.IgnoreWhitespace = false;
XmlReader reader = XmlTextReader.Create( stream, readerSettings );
while( reader.Read() ){
    if ( reader.MoveToContent() == XmlNodeType.Element && reader.Name != "data" ){
        System.Diagnostics.Trace.WriteLine(
            reader.NodeType
            + " "
            + reader.Name
            + " "
            + reader.ReadElementContentAsString()
        );
    }
}
stream.Close();
1.) settings.xml
<?xml version="1.0"?>
<data>
<node-1>value1</node-1>
<node-2>value2</node-2>
</data>
2.) settings.xml
<?xml version="1.0"?>
<data>
<node-1>value1</node-1><node-2>value2</node-2>
</data>
using (1) prints
Element node-1 value1
Element node-2 value2
using (2) prints
Element node-1 value1
Per the documentation on IgnoreWhitespace, a new line is not considered insignificant.
White space that is not considered to be significant includes spaces, tabs, and blank lines used to set apart the markup for greater readability. An example of this is white space in element content.
XmlReaderSettings.IgnoreWhitespace
What happens is that reader.Read() reads the whitespace node. When the whitespace is ignored or absent, the same call reads the second element instead (it swallows an XML token), which moves the pointer past the start of the node2 element.
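To make that concrete: ReadElementContentAsString already moves the reader past the end tag, so an unconditional Read() afterwards skips whatever node comes next. Here is a minimal sketch of a loop that avoids the extra Read(). The class and method names (LeafReader, ReadLeafValues) are made up for illustration, and it assumes the root contains only simple text elements, as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class LeafReader
{
    // Hypothetical helper: returns "name=value" for each child element of the root.
    // Assumes every child holds text only (ReadElementContentAsString throws otherwise).
    public static List<string> ReadLeafValues(string xml, bool ignoreWhitespace)
    {
        var results = new List<string>();
        var settings = new XmlReaderSettings { IgnoreWhitespace = ignoreWhitespace };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            reader.MoveToContent();     // skip the declaration, land on the root start tag
            reader.ReadStartElement();  // step inside the root element

            // MoveToContent skips whitespace and comments; it never skips an element.
            while (reader.MoveToContent() == XmlNodeType.Element)
            {
                string name = reader.Name;
                // Consumes start tag, text and end tag. The reader is now already
                // positioned on the following node, so no Read() call is needed here.
                string value = reader.ReadElementContentAsString();
                results.Add(name + "=" + value);
            }
        }
        return results;
    }

    public static void Main()
    {
        string compact = "<data><node-1>value1</node-1><node-2>value2</node-2></data>";
        foreach (string line in ReadLeafValues(compact, false))
            Console.WriteLine(line);
    }
}
```

With this loop the result should not depend on whether the elements are separated by line breaks or on the IgnoreWhitespace setting.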
Inspect the reader's properties before and after the methods called in your example. Check the NodeType and Value properties. Also take a look at the MoveToContent method; it is very useful.
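As a rough sketch of that kind of inspection, the following replays the loop from the question and records where the reader stands before and after each ReadElementContentAsString call. ReaderTrace and Trace are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReaderTrace
{
    // Hypothetical helper: runs the question's loop and logs
    // "position before -> value read -> position after" for each element read.
    public static List<string> Trace(string xml)
    {
        var log = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.MoveToContent() == XmlNodeType.Element && reader.Name != "data")
                {
                    string before = reader.NodeType + ":" + reader.Name;
                    string value = reader.ReadElementContentAsString();
                    // The reader has moved past the end tag at this point.
                    string after = reader.NodeType + ":" + reader.Name;
                    log.Add(before + " -> " + value + " -> " + after);
                }
            }
        }
        return log;
    }

    public static void Main()
    {
        string compact = "<data><node-1>value1</node-1><node-2>value2</node-2></data>";
        foreach (string entry in Trace(compact))
            Console.WriteLine(entry);
        // prints: Element:node-1 -> value1 -> Element:node-2
    }
}
```

For the compact document the "after" position is already the node-2 start tag, so the next Read() in the loop steps over it. With line breaks between the elements, the "after" position is a Whitespace node, and skipping that loses nothing.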
Read the documentation for all of these methods and properties, and you will learn how the XmlReader class works and how to use it for your purposes. Here is the first google result: it contains a very explicit example.
I ended up with the following (incomplete) pattern:
private static void ReadXmlExt(XmlReader xmlReader, IXmlSerializableExt xmlSerializable, ReadElementDelegate readElementCallback)
{
    bool isEmpty;

    if (xmlReader == null)
        throw new ArgumentNullException("xmlReader");
    if (readElementCallback == null)
        throw new ArgumentNullException("readElementCallback");

    // Empty element?
    isEmpty = xmlReader.IsEmptyElement;
    // Decode attributes
    if ((xmlReader.HasAttributes == true) && (xmlSerializable != null))
        xmlSerializable.ReadAttributes(xmlReader);
    // Read the root start element
    xmlReader.ReadStartElement();

    // Decode elements
    if (isEmpty == false) {
        do {
            // Read document till next element
            xmlReader.MoveToContent();

            if (xmlReader.NodeType == XmlNodeType.Element) {
                string elementName = xmlReader.LocalName;

                // Empty element?
                isEmpty = xmlReader.IsEmptyElement;
                // Decode child element
                readElementCallback(xmlReader);
                xmlReader.MoveToContent();

                // Read the child end element (not empty)
                if (isEmpty == false) {
                    // Delegate check: it has to reach an end element
                    if (xmlReader.NodeType != XmlNodeType.EndElement)
                        throw new InvalidOperationException(String.Format("not reached the end element"));
                    // Delegate check: the end element shall correspond to the start element before delegate
                    if (xmlReader.LocalName != elementName)
                        throw new InvalidOperationException(String.Format("not reached the relative end element of {0}", elementName));

                    // Child end element
                    xmlReader.ReadEndElement();
                }
            } else if (xmlReader.NodeType == XmlNodeType.Text) {
                if (xmlSerializable != null) {
                    // Interface
                    xmlSerializable.ReadText(xmlReader);
                    Debug.Assert(xmlReader.NodeType != XmlNodeType.Text, "IXmlSerializableExt.ReadText shall read the text");
                } else
                    xmlReader.Skip(); // Skip text
            }
        } while (xmlReader.NodeType != XmlNodeType.EndElement);
    }
}