If the data is on a single line, the calls
index = int.Parse(logDataReader.ReadElementContentAsString());
and
value = double.Parse(logDataReader.ReadElementContentAsString(), CultureInfo.InvariantCulture);
move the cursor forward. If I take those calls out, I see it loop 6 times in the debugger.
In the following XML, only 3 <data> elements are read from the first block (<logData id="Alpha">), and they are wrong because each value belongs to the next index. From the second block (<logData id="Bravo">), all <data> elements are read.
Editing the XML to add line breaks is not an option, because the file is created dynamically (by XmlWriter). The NewLineChars setting is a line feed. The output from XmlWriter is actually just one line; I broke it up to figure out where it was failing. In the browser it is displayed properly.
How can I fix this?
Here is my XML:
<?xml version="1.0" encoding="utf-8"?>
<log>
<logData id="Alpha">
<data><index>100</index><value>150</value></data>
<data><index>110</index><value>750</value></data>
<data><index>120</index><value>750</value></data>
<data><index>130</index><value>150</value></data>
<data><index>140</index><value>0</value></data>
<data><index>150</index><value>222</value></data>
</logData>
<logData id="Bravo">
<data>
<index>100</index>
<value>25</value>
</data>
<data>
<index>110</index>
<value>11</value>
</data>
<data>
<index>120</index>
<value>1</value>
</data>
<data>
<index>130</index>
<value>25</value></data>
<data>
<index>140</index>
<value>0</value>
</data>
<data>
<index>150</index>
<value>1</value>
</data>
</logData>
</log>
And my code:
static void Main(string[] args)
{
    List<LogData> logDatas = GetLogDatasFromFile("singleVersusMultLine.xml");
    Debug.WriteLine("Main");
    Debug.WriteLine("logData");
    foreach (LogData logData in logDatas)
    {
        Debug.WriteLine($" logData.ID {logData.ID}");
        foreach (LogPoint logPoint in logData.LogPoints)
        {
            Debug.WriteLine($" logData.Index {logPoint.Index} logData.Value {logPoint.Value}");
        }
    }
    Debug.WriteLine("end");
}
public static List<LogData> GetLogDatasFromFile(string xmlFile)
{
    List<LogData> logDatas = new List<LogData>();
    using (XmlReader reader = XmlReader.Create(xmlFile))
    {
        // move to next "logData"
        while (reader.ReadToFollowing("logData"))
        {
            var logData = new LogData(reader.GetAttribute("id"));
            using (var logDataReader = reader.ReadSubtree())
            {
                // inside "logData" subtree, move to next "data"
                while (logDataReader.ReadToFollowing("data"))
                {
                    // move to index
                    logDataReader.ReadToFollowing("index");
                    // read index
                    var index = int.Parse(logDataReader.ReadElementContentAsString());
                    // move to value
                    logDataReader.ReadToFollowing("value");
                    // read value
                    var value = double.Parse(logDataReader.ReadElementContentAsString(), CultureInfo.InvariantCulture);
                    logData.LogPoints.Add(new LogPoint(index, value));
                }
            }
            logDatas.Add(logData);
        }
    }
    return logDatas;
}
public class LogData
{
    public string ID { get; }
    public List<LogPoint> LogPoints { get; } = new List<LogPoint>();
    public LogData(string id)
    {
        ID = id;
    }
}
public class LogPoint
{
    public int Index { get; }
    public double Value { get; }
    public LogPoint(int index, double value)
    {
        Index = index;
        Value = value;
    }
}
Your problem is as follows. According to the documentation for XmlReader.ReadElementContentAsString():
This method reads the start tag, the contents of the element, and moves the reader past the end element tag.
And from the documentation for XmlReader.ReadToFollowing(String):
It advances the reader to the next following element that matches the specified name and returns true if a matching element is found.
Thus, after the call to ReadElementContentAsString(), the reader has already advanced to the next node, so it may already be positioned on the next <value> or <data> element. When you then call ReadToFollowing(), that element is skipped, because the method unconditionally advances before looking for a matching name. If the XML is indented, however, the node immediately after the call to ReadElementContentAsString() is an XmlNodeType.Whitespace node, which masks the bug.
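To make the difference concrete, here is a minimal sketch that reports which node type the reader lands on right after reading <index>. The class name NodeAfterReadDemo and the two inline XML strings are made up for illustration; it assumes the default XmlReaderSettings, where IgnoreWhitespace is false.

```csharp
using System;
using System.IO;
using System.Xml;

public static class NodeAfterReadDemo
{
    // Returns the type of the node the reader is positioned on immediately
    // after reading the content of the first <index> element.
    public static XmlNodeType NodeAfterIndex(string xml)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("index", "");
            reader.ReadElementContentAsString(); // moves past </index>
            return reader.NodeType;
        }
    }

    public static void Main()
    {
        // Illustrative inputs: the same data, unindented and indented.
        string compact = "<data><index>100</index><value>150</value></data>";
        string indented = "<data>\n  <index>100</index>\n  <value>150</value>\n</data>";

        // Unindented: the reader is already on the <value> element.
        Console.WriteLine(NodeAfterIndex(compact));
        // Indented: the reader is on the whitespace between the elements.
        Console.WriteLine(NodeAfterIndex(indented));
    }
}
```

In the unindented case a following ReadToFollowing("value") starts from the <value> element itself and therefore moves past it.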
The solution is to check whether the reader is already positioned correctly after the call to ReadElementContentAsString(). First, introduce the following extension method:
public static class XmlReaderExtensions
{
    public static bool ReadToFollowingOrCurrent(this XmlReader reader, string localName, string namespaceURI)
    {
        if (reader == null)
            throw new ArgumentNullException(nameof(reader));
        if (reader.NodeType == XmlNodeType.Element && reader.LocalName == localName && reader.NamespaceURI == namespaceURI)
            return true;
        return reader.ReadToFollowing(localName, namespaceURI);
    }
}
Then modify your code as follows:
public static List<LogData> GetLogDatasFromFile(string xmlFile)
{
    List<LogData> logDatas = new List<LogData>();
    using (XmlReader reader = XmlReader.Create(xmlFile))
    {
        // move to next "logData"
        while (reader.ReadToFollowing("logData", ""))
        {
            var logData = new LogData(reader.GetAttribute("id"));
            using (var logDataReader = reader.ReadSubtree())
            {
                // inside "logData" subtree, move to next "data"
                while (logDataReader.ReadToFollowing("data", ""))
                {
                    // move to index
                    logDataReader.ReadToFollowing("index", "");
                    // read index
                    var index = XmlConvert.ToInt32(logDataReader.ReadElementContentAsString());
                    // move to value, unless the reader is already on it
                    logDataReader.ReadToFollowingOrCurrent("value", "");
                    // read value
                    var value = XmlConvert.ToDouble(logDataReader.ReadElementContentAsString());
                    logData.LogPoints.Add(new LogPoint(index, value));
                }
            }
            logDatas.Add(logData);
        }
    }
    return logDatas;
}
Notes:
Always prefer to use XmlReader methods in which the local name and namespace are specified separately, such as XmlReader.ReadToFollowing(String, String). When you use a method such as XmlReader.ReadToFollowing(String), which accepts a single qualified name, you are implicitly hardcoding the choice of XML prefix, which is generally not a good idea. XML parsing should be independent of prefix choice.
While you correctly parsed your double using the CultureInfo.InvariantCulture locale, it's even easier to use the methods from the XmlConvert class to handle parsing and formatting correctly.
XmlReader.ReadSubtree() leaves the XmlReader positioned on the EndElement node of the element being read, so you shouldn't need to call ReadToFollowingOrCurrent() afterwards. (Nice use of ReadSubtree() to avoid reading too little or too much, by the way; by using this method one can avoid several frequent mistakes with XmlReader.)
As you have found, code that manually reads XML using XmlReader should always be unit-tested with both formatted and unformatted XML, because certain bugs will only arise with one or the other. (See e.g. this answer, this one and this one also for other examples of such.)
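One possible way to exercise both forms, sketched under some assumptions: the same document is generated twice with XmlWriter, once with Indent = true and once with Indent = false, and the parsed results are compared. The class FormattingRoundTripCheck, its sample values, and the simplified single-block parser are hypothetical; the parser inlines the same "already on the element" check as ReadToFollowingOrCurrent() so that the block stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;

public static class FormattingRoundTripCheck
{
    // Writes the same small <log> document either indented or on one line.
    public static string BuildXml(bool indent)
    {
        int[] indexes = { 100, 110, 120 };     // illustrative sample data
        double[] values = { 150, 750, 0.5 };
        var settings = new XmlWriterSettings { Indent = indent, OmitXmlDeclaration = true };
        var sb = new StringBuilder();
        using (var writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("log");
            writer.WriteStartElement("logData");
            writer.WriteAttributeString("id", "Alpha");
            for (int i = 0; i < indexes.Length; i++)
            {
                writer.WriteStartElement("data");
                writer.WriteElementString("index", XmlConvert.ToString(indexes[i]));
                writer.WriteElementString("value", XmlConvert.ToString(values[i]));
                writer.WriteEndElement();
            }
            writer.WriteEndElement();
            writer.WriteEndElement();
        }
        return sb.ToString();
    }

    // Simplified parser: collects (index, value) pairs from every <data>.
    public static List<KeyValuePair<int, double>> ParsePoints(string xml)
    {
        var points = new List<KeyValuePair<int, double>>();
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.ReadToFollowing("data", ""))
            {
                reader.ReadToFollowing("index", "");
                int index = XmlConvert.ToInt32(reader.ReadElementContentAsString());
                // Only advance if the reader is not already on <value>.
                bool onValue = reader.NodeType == XmlNodeType.Element
                    && reader.LocalName == "value" && reader.NamespaceURI == "";
                if (!onValue)
                    reader.ReadToFollowing("value", "");
                double value = XmlConvert.ToDouble(reader.ReadElementContentAsString());
                points.Add(new KeyValuePair<int, double>(index, value));
            }
        }
        return points;
    }

    public static void Main()
    {
        var fromIndented = ParsePoints(BuildXml(true));
        var fromCompact = ParsePoints(BuildXml(false));
        // Both forms should yield the same points in the same order.
        Console.WriteLine(fromIndented.SequenceEqual(fromCompact));
    }
}
```

In a real test project the comparison in Main would become an assertion in whichever test framework is already in use.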
Working sample .Net fiddle here.
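Returning to the first note, the following sketch illustrates why matching on local name plus namespace is more robust than matching on a qualified name. The namespace URI urn:example:log, the prefixes a and b, and the class PrefixIndependenceDemo are invented for this illustration; the XML in the question uses no namespace at all.

```csharp
using System;
using System.IO;
using System.Xml;

public static class PrefixIndependenceDemo
{
    // Hypothetical namespace URI, used only for this illustration.
    public const string Ns = "urn:example:log";

    // Looks for an element by its qualified name (prefix included).
    public static bool FindByQualifiedName(string xml, string qualifiedName)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return reader.ReadToFollowing(qualifiedName);
        }
    }

    // Looks for an element by local name and namespace URI.
    public static bool FindByLocalName(string xml, string localName)
    {
        using (var reader = XmlReader.Create(new StringReader(xml)))
        {
            return reader.ReadToFollowing(localName, Ns);
        }
    }

    public static void Main()
    {
        // Two equivalent documents that differ only in prefix choice.
        string withA = "<a:log xmlns:a=\"" + Ns + "\"><a:data>1</a:data></a:log>";
        string withB = "<b:log xmlns:b=\"" + Ns + "\"><b:data>1</b:data></b:log>";

        // Qualified-name lookup is tied to the prefix "a".
        Console.WriteLine(FindByQualifiedName(withA, "a:data"));
        Console.WriteLine(FindByQualifiedName(withB, "a:data"));

        // Local name plus namespace finds the element in both documents.
        Console.WriteLine(FindByLocalName(withA, "data"));
        Console.WriteLine(FindByLocalName(withB, "data"));
    }
}
```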