How to parse deep XML values with C#

Tags: c#, xml

Working in Visual Studio 2010, .NET v4

I'm a relative newbie to both C# and XML. Using website resources, I was able to work out a fairly simple approach to parsing XML for specific values. However, the XML became increasingly complex, and the method I'm using (sort of a 'navigate via iterations' technique) is starting to look rather absurd, with many nested reader.Read() calls. I'm certain there's a less embarrassing way, but when the only tool you have is a hammer...

So, the question is: What's a neat, clean way to parse the XML (see below), returning a list of all 'action' values, only from an item matching <item itemkind="Wanted"> and <phrase>SomeSearchString</phrase>?

Here is a fragment of the XML:

<formats>
  <items>
    <item itemkind="NotWanted">
      <phrase>Not this one</phrase>
      <actions>
        <action>
          <actionkind>SetStyle</actionkind>
          <parameter>Normal</parameter>
        </action>
        <action>
          <actionkind>SetMargins</actionkind>
          <parameter>0.25,0.25,1,4</parameter>
        </action>
      </actions>
    </item>

    <item itemkind="Wanted">
      <phrase>SomeSearchString</phrase>
      <actions>
        <action>
          <actionkind>Action 1</actionkind>
          <parameter>Param 1</parameter>
        </action>
        <action>
          <actionkind>Action 2</actionkind>
          <parameter>Param 2</parameter>
        </action>
        <action>
          <actionkind>Action 3</actionkind>
          <parameter>Param 3</parameter>
        </action>
      </actions>
    </item>
  </items>

  <styles>
    <style stylename="Normal">
      <fontname>Arial</fontname>
      <fontsize>10</fontsize>
      <bold>0</bold>
    </style>
    <style stylename="Heading">
      <fontname>fntame frhead</fontname>
      <fontsize>12</fontsize>
      <bold>1</bold>
    </style>
  </styles>
</formats>

And here's the code that I've arrived at. It does work, but, well, see for yourself. Please be gentle:

public static List<TAction> GetActionsForPhraseItem(string AFileName, string APhrase)
{
   List<TAction> list = new List<TAction>();
   string xmlactionkind = null;
   string xmlparameter = null;
   string match = null;

   // Search through XML items
   using (XmlReader reader = XmlReader.Create(AFileName))
   {
      if (reader.ReadToFollowing("items"))
      {
         while (reader.Read())
         {
            if (reader.ReadToFollowing("item"))
            {
               while (reader.Read())
               {
                  if (reader.NodeType == XmlNodeType.Element && reader.GetAttribute("itemkind") == "Wanted")
                  {
                     if (reader.ReadToFollowing("phrase"))
                     {
                        match = reader.ReadString();
                        if (match == APhrase)
                        {
                           if (reader.ReadToFollowing("actions"))
                           {
                              // Use a subtree to deal with just the matched item's actions
                              using (var SubTree = reader.ReadSubtree())
                              {
                                 bool HaveActionKind = false;
                                 bool HaveParameter = false;

                                 while (SubTree.Read())
                                 {
                                    if (SubTree.NodeType == XmlNodeType.Element && SubTree.Name == "actionkind")
                                    {
                                       xmlactionkind = SubTree.ReadString();
                                       HaveActionKind = true;
                                    }

                                    if (SubTree.NodeType == XmlNodeType.Element && SubTree.Name == "parameter")
                                    {
                                       xmlparameter = SubTree.ReadString();
                                       HaveParameter = true;
                                    }

                                    if ((HaveActionKind == true) && (HaveParameter == true))
                                    {
                                       TAction action = new TAction()
                                       {
                                          ActionKind = xmlactionkind,
                                          Parameter = xmlparameter
                                       };

                                       list.Add(action);
                                       HaveActionKind = false;
                                       HaveParameter = false;
                                    }
                                 }
                              }
                           }
                        }
                     }
                  }
               }
            }
         }
      }
   }
   return list;
}

Bearing in mind that I'm new to C#, I suspect that LINQ would be quite useful here, but so far I haven't been able to wrap my brain around it. Trying to learn too many new things at once, I imagine. Thanks in advance for any help (and constructive criticisms).

EDIT: This is the final working code I ended up with. Thanks to everyone who responded!

public static List<TAction> GetActionsForPhraseItemTWO(string AFileName, string ASearchPhrase)
{
  List<TAction> list = new List<TAction>();
  var itemKind = "Wanted";
  var searchPhrase = ASearchPhrase;
  var doc = XDocument.Load(AFileName);
  var matches = doc.Descendants("item")
    .Where(x => x.Attribute("itemkind") != null &&
       x.Attribute("itemkind").Value == itemKind &&
       x.Descendants("phrase").FirstOrDefault() != null &&
       x.Descendants("phrase").FirstOrDefault().Value == searchPhrase)
    .SelectMany(x => x.Descendants("action"));
  foreach (var temp in matches)
  {
    TAction action = new TAction()
    {
      ActionKind = temp.Element("actionkind").Value,
      Parameter = temp.Element("parameter").Value
    };
    list.Add(action);
  }
  return list;
}
Eric S. asked Sep 04 '13 22:09
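
The final code in the question can be tightened a little. The following is only a sketch, assuming the XML layout shown above: it relies on the explicit string conversions that LINQ to XML defines for XAttribute and XElement, which return null for a missing node instead of throwing. The TAction class here is a hypothetical stand-in (the question does not show its definition), and ActionFinder and FromDocument are illustrative names. Unlike the original, it looks at phrase only as a direct child of item, which is equivalent for the sample document.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for the TAction type used in the question.
public class TAction
{
    public string ActionKind { get; set; }
    public string Parameter { get; set; }
}

public static class ActionFinder
{
    // Returns the actions of every <item> whose itemkind attribute and
    // <phrase> child both match the given values.
    public static List<TAction> FromDocument(XDocument doc, string itemKind, string phrase)
    {
        return doc.Descendants("item")
            // Casting a missing attribute or element to string yields null
            // rather than throwing, so no separate null checks are needed.
            .Where(item => (string)item.Attribute("itemkind") == itemKind
                        && (string)item.Element("phrase") == phrase)
            // Walk item -> actions -> action using direct children only.
            .SelectMany(item => item.Elements("actions").Elements("action"))
            .Select(a => new TAction
            {
                ActionKind = (string)a.Element("actionkind"),
                Parameter = (string)a.Element("parameter")
            })
            .ToList();
    }
}
```

It would be called as ActionFinder.FromDocument(XDocument.Load(AFileName), "Wanted", ASearchPhrase). An action with a missing actionkind or parameter element ends up with a null property rather than causing an exception.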


2 Answers

// XPathSelectElement is an extension method from the System.Xml.XPath namespace
var node = XDocument.Load(fname)
                    .XPathSelectElement("//item[@itemkind='Wanted']/phrase");
var text = node.Value;
EZI answered Sep 28 '22 12:09
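
The XPath approach can probably be extended to return the action elements the question asks for, by adding a second predicate on phrase and continuing the path down to action. This is a sketch rather than part of the answer above: XPathActions and Find are hypothetical names, and the search phrase is hard-coded because building an XPath string from arbitrary input would need care with quoting.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;   // provides the XPathSelectElements extension method

public static class XPathActions
{
    // Returns (actionkind, parameter) pairs for the wanted item.
    public static List<Tuple<string, string>> Find(XDocument doc)
    {
        // Two predicates on <item>: the attribute value and the <phrase> child text.
        const string path =
            "//item[@itemkind='Wanted'][phrase='SomeSearchString']/actions/action";

        return doc.XPathSelectElements(path)
            .Select(a => Tuple.Create(
                (string)a.Element("actionkind"),
                (string)a.Element("parameter")))
            .ToList();
    }
}
```

For the sample XML in the question, this should select the three action elements under the "Wanted" item and none from the "NotWanted" one.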


var val = XDocument.Load(filename) // OR  XDocument.Parse(xmlstring)
            .Descendants("item")
            // the string cast yields null instead of throwing when itemkind is absent
            .First(i => (string)i.Attribute("itemkind") == "Wanted")
            .Element("phrase")
            .Value;
I4V answered Sep 28 '22 10:09
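
One thing to keep in mind with the First call is that it throws an InvalidOperationException when no item matches. If a missing item is an expected case, FirstOrDefault with a null check may be a better fit. A minimal sketch of that variation, with PhraseLookup and FirstPhrase as hypothetical names:

```csharp
using System.Linq;
using System.Xml.Linq;

public static class PhraseLookup
{
    // Returns the phrase text of the first <item> with the given itemkind,
    // or null when no such item (or no <phrase> child) exists.
    public static string FirstPhrase(XDocument doc, string itemKind)
    {
        XElement item = doc.Descendants("item")
            .FirstOrDefault(i => (string)i.Attribute("itemkind") == itemKind);

        // The string cast on a missing <phrase> element also yields null.
        return item == null ? null : (string)item.Element("phrase");
    }
}
```

The caller then decides how to treat a null result, instead of wrapping the query in a try/catch.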