Using XmlReader class to parse XML with elements of the same name

Tags: c#, xml, xmlreader

I'm rewriting some code that uses an XmlDocument to parse some XML. I want to use an XmlReader instead to see if I can get some performance improvements. The structure of the XML looks like this:

<items>
   <item id="1" desc="one">
      <itemBody date="2012-11-12" />
   </item>
   <item id="2" desc="two">
      <itemBody date="2012-11-13" />
   </item>
   <item id="3" desc="three">
      <itemBody date="2012-11-14" />
   </item>
   <item id="4" desc="four">
      <itemBody date="2012-11-15" />
   </item>
</items>

Basically, I need to iterate through all the <item> elements. Like I said, the old code works like this:

XmlDocument document = new XmlDocument();

// load XML into XmlDocument
document.LoadXml(xml);

// use xpath to split into individual item
string xPath = @"items/item";
XmlNodeList nodeList = document.SelectNodes(xPath);

// loop through each item
for (int nodeIndex = 0; nodeIndex < nodeList.Count; nodeIndex++)
{
    // do something with the XmlNode
    XmlNode node = nodeList[nodeIndex];
}

This works fine, but I think using an XmlReader would be faster. So I've written this:

XmlReader xmlReader = XmlReader.Create(new StringReader(xml));

while (xmlReader.Read())
{                       
   if (xmlReader.Name.Equals("item") && (xmlReader.NodeType == XmlNodeType.Element))
   {
      string id = xmlReader.GetAttribute("id");                 
      string desc = xmlReader.GetAttribute("desc");
      string elementXml = xmlReader.ReadOuterXml();
   }
}

However, this code only reads the first <item> element. The call to ReadOuterXml() is breaking the loop. Does anybody know how to get around this? Or is this type of parsing not possible with an XmlReader? I have to do this using .NET version 2 :( so I can't use LINQ.

Matt Puleston asked Nov 30 '12 09:11


2 Answers

I just tested your code in LINQPad (with simplified XML and without the ReadOuterXml() call), and it works well.

var xml = @"<items>
   <item id='1' desc='one' />
   <item id='2' desc='two' />
   <item id='3' desc='three' />
   <item id='4' desc='four' />
</items>";
XmlReader xmlReader = XmlReader.Create(new StringReader(xml));

while (xmlReader.Read())
{   
   if (xmlReader.Name.Equals("item") && (xmlReader.NodeType == XmlNodeType.Element))
   {
      string id = xmlReader.GetAttribute("id");              
      string desc = xmlReader.GetAttribute("desc");
      Console.WriteLine("{0} {1}", id, desc);
   }
}

Output:

1 one
2 two
3 three
4 four

Maybe there is something wrong with your XML.
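One possible explanation, sketched here under the assumption that the real input has no whitespace between elements: ReadOuterXml() leaves the reader on the node that follows </item>, and the Read() in the while condition then steps past that node. With indented XML the skipped node is only whitespace; with compact XML it is the next <item> start tag. The class name WhitespaceDemo and the sample strings are made up for illustration, and the code sticks to .NET 2.0 features.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class WhitespaceDemo
{
    // Same loop shape as in the question: ReadOuterXml() followed by Read().
    public static List<string> ReadIds(string xml)
    {
        List<string> ids = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    ids.Add(reader.GetAttribute("id"));
                    // Consumes the whole <item> and leaves the reader
                    // on whatever node comes after </item>.
                    reader.ReadOuterXml();
                }
            }
        }
        return ids;
    }

    public static void Main()
    {
        // Hypothetical sample data: the same four items, with and without whitespace.
        string indented =
            "<items>\n" +
            "  <item id=\"1\"><itemBody /></item>\n" +
            "  <item id=\"2\"><itemBody /></item>\n" +
            "  <item id=\"3\"><itemBody /></item>\n" +
            "  <item id=\"4\"><itemBody /></item>\n" +
            "</items>";
        string compact = indented.Replace("\n", "").Replace("  ", "");

        // Whitespace nodes absorb the extra Read(), so every item is seen.
        Console.WriteLine(string.Join(",", ReadIds(indented).ToArray())); // 1,2,3,4

        // Without whitespace the extra Read() skips every second item.
        Console.WriteLine(string.Join(",", ReadIds(compact).ToArray()));  // 1,3
    }
}
```

If the XML in the real application arrives on a single line, this would account for items going missing even though the indented sample parses fine.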

Oleg answered Sep 29 '22 12:09


The following seems to work:

        StringBuilder xml = new StringBuilder();

        xml.Append("<items>");
        xml.Append("<item id=\"1\" desc=\"one\">");
        xml.Append("<itembody id=\"10\"/>");
        xml.Append("</item>");
        xml.Append("<item id=\"2\" desc=\"two\">");
        xml.Append("<itembody id=\"20\"/>");
        xml.Append("</item>");
        xml.Append("<item id=\"3\" desc=\"three\">");
        xml.Append("<itembody id=\"30\"/>");
        xml.Append("</item>");
        xml.Append("</items>");

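        using (XmlTextReader tr = new XmlTextReader(new StringReader(xml.ToString())))
        {
            // Prime the reader; canRead becomes false at the end of the input.
            bool canRead = tr.Read();
            while (canRead)
            {
                if ((tr.Name == "item") && tr.IsStartElement())
                {
                    Console.WriteLine(tr.GetAttribute("id"));
                    Console.WriteLine(tr.GetAttribute("desc"));

                    // ReadOuterXml() consumes the whole <item> and leaves the reader
                    // on the node after </item>, so Read() must not be called here.
                    string outerxml = tr.ReadOuterXml();
                    Console.WriteLine(outerxml);

                    canRead = (outerxml != string.Empty);
                }
                else
                {
                    // Not on an <item> start tag: advance one node.
                    canRead = tr.Read();
                }
            }
        }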
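
The same idea can be written against the XML from the question. This is only a sketch, not the one right way to do it: the class name ItemReaderSketch and the method ReadItems are made up for illustration, and the loop tests reader.EOF instead of tracking a canRead flag. The rule is the same as above: call Read() only when ReadOuterXml() has not already advanced the reader.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ItemReaderSketch
{
    // Returns the outer XML of every <item> element, in document order.
    public static List<string> ReadItems(string xml)
    {
        List<string> items = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "item")
                {
                    // ReadOuterXml() already moves to the node after </item>,
                    // so no extra Read() here.
                    items.Add(reader.ReadOuterXml());
                }
                else
                {
                    // Any other node (including the initial state): advance by one.
                    reader.Read();
                }
            }
        }
        return items;
    }

    public static void Main()
    {
        // Compact input, the case where the original loop loses items.
        string xml =
            "<items>" +
            "<item id=\"1\" desc=\"one\"><itemBody date=\"2012-11-12\" /></item>" +
            "<item id=\"2\" desc=\"two\"><itemBody date=\"2012-11-13\" /></item>" +
            "<item id=\"3\" desc=\"three\"><itemBody date=\"2012-11-14\" /></item>" +
            "<item id=\"4\" desc=\"four\"><itemBody date=\"2012-11-15\" /></item>" +
            "</items>";

        foreach (string item in ReadItems(xml))
        {
            Console.WriteLine(item);
        }
    }
}
```

Because the reader never advances twice after an item, the result does not depend on whether the input contains whitespace between elements.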
Paul Diston answered Sep 29 '22 11:09