
Why is the XmlReader skipping elements?

Tags: c#, .net, xmlreader

Please note that this question is specifically about XmlReader, not about whether to use XDocument or XmlReader.

I have the following XML fragment:

private string GetXmlFragment()
{
    return @"<bookstore>
          <book genre='novel' ISBN='10-861003-324'>
            <title>The Handmaid's Tale</title>
            <price>19.95</price>
          </book>
          <book genre='novel' ISBN='1-861001-57-5'>
            <title>Pride And Prejudice</title>
            <price>24.95</price>
          </book>
        </bookstore>";
}

I also have the following extension method:

public static IEnumerable<XElement> GetElement(this XmlReader reader, string elementName)
{
    reader.MoveToElement();

    while (reader.Read())
    {
        if (reader.NodeType == XmlNodeType.Element 
            && reader.Name.Equals(elementName, StringComparison.InvariantCulture))
        {
            yield return XNode.ReadFrom(reader) as XElement;
        }
    }
}

I then try to get the two book elements by doing:

var xmlReaderSettings = new XmlReaderSettings
{
    CheckCharacters = false,
    ConformanceLevel = ConformanceLevel.Fragment,
    IgnoreComments = true,
    IgnoreWhitespace = true,
    IgnoreProcessingInstructions = true
};

using (var stringReader = new StringReader(this.GetXmlFragment()))
using (var xmlReader = XmlReader.Create(stringReader, xmlReaderSettings))
{
    xmlReader.GetElement("book").Count().ShouldBe(2);
}

However, I only get the first element. Debugging shows that as soon as I get the first element, the reader jumps to the title of the second book element.

The solution was inspired by the one linked HERE

Any help is much appreciated.

babayi asked Apr 28 '16 08:04


2 Answers

The problem is that XNode.ReadFrom() leaves the reader positioned on the node that follows the element it has just read. If there is no intervening whitespace (or whitespace is ignored, as with IgnoreWhitespace = true in your settings), that node is the next book element. The reader.Read() call in the while condition then immediately consumes this element before we can check it. The fix is to not call XmlReader.Read() immediately after XNode.ReadFrom(), but to keep checking the current node, because the read has already been done implicitly:

while (reader.Read()) {
    while (reader.NodeType == XmlNodeType.Element 
           && reader.Name.Equals(elementName, StringComparison.InvariantCulture)) {
        yield return XNode.ReadFrom(reader) as XElement;
    }
}

(In case it's not clear, the if in the loop has been changed to a while.)
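
To see where the reader ends up after XNode.ReadFrom(), here is a minimal sketch. The class and method names (ReaderPositionDemo, PositionAfterFirstBook) are made up for illustration, and the XML is a trimmed version of the fragment from the question. It assumes IgnoreWhitespace = true, as in the question's settings.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Linq;

// Hypothetical demo class, not part of the question's code.
public static class ReaderPositionDemo
{
    private const string Xml =
        @"<bookstore>
            <book genre='novel' ISBN='10-861003-324'><title>The Handmaid's Tale</title></book>
            <book genre='novel' ISBN='1-861001-57-5'><title>Pride And Prejudice</title></book>
          </bookstore>";

    // Describes the node the reader is on right after XNode.ReadFrom
    // has consumed the first <book> element.
    public static string PositionAfterFirstBook()
    {
        var settings = new XmlReaderSettings { IgnoreWhitespace = true };
        using (var stringReader = new StringReader(Xml))
        using (var reader = XmlReader.Create(stringReader, settings))
        {
            reader.ReadToFollowing("book");   // move onto the first <book>
            XNode.ReadFrom(reader);           // consumes the whole first <book>
            // No Read() call here: the reader has already moved on by itself.
            return reader.NodeType + ":" + reader.Name + ":" + reader.GetAttribute("ISBN");
        }
    }

    public static void Main()
    {
        // Prints "Element:book:1-861001-57-5"
        Console.WriteLine(PositionAfterFirstBook());
    }
}
```

The reader is already sitting on the second book element, so one more Read() steps into it and lands on its title, which matches what the debugger shows in the question.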

Jeroen Mostert answered Nov 13 '22 15:11


public static IEnumerable<XElement> GetElement(this XmlReader reader, string elementName)
{
    while (!reader.EOF)
        if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
            // XNode.ReadFrom already advances the reader past the element
            yield return XNode.ReadFrom(reader) as XElement;
        else
            reader.Read();
}
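
As a rough check, the sketch below runs the loop from the question and a loop in the style of this answer against the same input and counts the results. The names GetElementSkipping, GetElementFixed and GetElementDemo are invented for this comparison, and the XML is a compact version of the question's fragment with no whitespace between elements, so no special reader settings are needed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Hypothetical names, used only for this comparison.
public static class XmlReaderExtensions
{
    // Loop from the question: Read() runs after every ReadFrom, so an
    // element the reader is already positioned on gets stepped over.
    public static IEnumerable<XElement> GetElementSkipping(this XmlReader reader, string elementName)
    {
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                yield return (XElement)XNode.ReadFrom(reader);
        }
    }

    // Loop in the style of this answer: advance only when nothing was consumed.
    public static IEnumerable<XElement> GetElementFixed(this XmlReader reader, string elementName)
    {
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == elementName)
                yield return (XElement)XNode.ReadFrom(reader);
            else
                reader.Read();
        }
    }
}

public static class GetElementDemo
{
    private const string Xml =
        "<bookstore>" +
        "<book ISBN='10-861003-324'><title>The Handmaid's Tale</title></book>" +
        "<book ISBN='1-861001-57-5'><title>Pride And Prejudice</title></book>" +
        "</bookstore>";

    // Creates a fresh reader over the sample XML and counts what the given loop yields.
    public static int Count(Func<XmlReader, IEnumerable<XElement>> read)
    {
        using (var stringReader = new StringReader(Xml))
        using (var reader = XmlReader.Create(stringReader))
        {
            return read(reader).Count();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Count(r => r.GetElementSkipping("book"))); // prints 1
        Console.WriteLine(Count(r => r.GetElementFixed("book")));    // prints 2
    }
}
```

The enumeration is lazy, so Count() has to run while the reader is still open, which is why it is called inside the using block.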

Denis Fedak answered Nov 13 '22 14:11