Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does one parse XML files? [closed]

Tags:

c#

xml

People also ask

How are XML files parsed?

The XML DOM (Document Object Model) defines the properties and methods for accessing and editing XML. However, before an XML document can be accessed, it must be loaded into an XML DOM object. All modern browsers have a built-in XML parser that can convert text into an XML DOM object.

What are the two types of XML parser?

In PHP there are two major types of XML parsers: Tree-Based Parsers. Event-Based Parsers.

What does parse XML activity do?

Parse XML is a synchronous activity that takes a binary XML file or an XML string and converts it into an XML schema tree based on the XSD specified.


It's very simple. I know these are standard methods, but you can create your own library to deal with that much better.

Here are some examples:

XmlDocument xmlDoc= new XmlDocument(); // Create an XML document object
xmlDoc.Load("yourXMLFile.xml"); // Load the XML document from the specified file

// Get elements
XmlNodeList girlAddress = xmlDoc.GetElementsByTagName("gAddress");
XmlNodeList girlAge = xmlDoc.GetElementsByTagName("gAge"); 
XmlNodeList girlCellPhoneNumber = xmlDoc.GetElementsByTagName("gPhone");

// Display the results
Console.WriteLine("Address: " + girlAddress[0].InnerText);
Console.WriteLine("Age: " + girlAge[0].InnerText);
Console.WriteLine("Phone Number: " + girlCellPhoneNumber[0].InnerText);

Also, there are some other methods to work with. For example, here. And I think there is no one best method to do this; you always need to choose it by yourself, what is most suitable for you.


I'd use LINQ to XML if you're in .NET 3.5 or higher.


Use a good XSD Schema to create a set of classes with xsd.exe and use an XmlSerializer to create a object tree out of your XML and vice versa. If you have few restrictions on your model, you could even try to create a direct mapping between you model classes and the XML with the Xml*Attributes.

There is an introductory article about XML Serialisation on MSDN.

Performance tip: Constructing an XmlSerializer is expensive. Keep a reference to your XmlSerializer instance if you intend to parse/write multiple XML files.


If you're processing a large amount of data (many megabytes) then you want to be using XmlReader to stream parse the XML.

Anything else (XPathNavigator, XElement, XmlDocument and even XmlSerializer if you keep the full generated object graph) will result in high memory usage and also a very slow load time.

Of course, if you need all the data in memory anyway, then you may not have much choice.


Use XmlTextReader, XmlReader, XmlNodeReader and the System.Xml.XPath namespace. And (XPathNavigator, XPathDocument, XPathExpression, XPathnodeIterator).

Usually XPath makes reading XML easier, which is what you might be looking for.


I have just recently been required to work on an application which involved the parsing of an XML document and I agree with Jon Galloway that the LINQ to XML based approach is, in my opinion, the best. I did however have to dig a little to find usable examples, so without further ado, here are a few!

Any comments welcome as this code works but may not be perfect and I would like to learn more about parsing XML for this project!

public void ParseXML(string filePath)  
{  
    // create document instance using XML file path
    XDocument doc = XDocument.Load(filePath);

    // get the namespace to that within of the XML (xmlns="...")
    XElement root = doc.Root;
    XNamespace ns = root.GetDefaultNamespace();

    // obtain a list of elements with specific tag
    IEnumerable<XElement> elements = from c in doc.Descendants(ns + "exampleTagName") select c;

    // obtain a single element with specific tag (first instance), useful if only expecting one instance of the tag in the target doc
    XElement element = (from c in doc.Descendants(ns + "exampleTagName" select c).First();

    // obtain an element from within an element, same as from doc
    XElement embeddedElement = (from c in element.Descendants(ns + "exampleEmbeddedTagName" select c).First();

    // obtain an attribute from an element
    XAttribute attribute = element.Attribute("exampleAttributeName");
}

With these functions I was able to parse any element and any attribute from an XML file no problem at all!


In Addition you can use XPath selector in the following way (easy way to select specific nodes):

XmlDocument doc = new XmlDocument();
doc.Load("test.xml");

var found = doc.DocumentElement.SelectNodes("//book[@title='Barry Poter']"); // select all Book elements in whole dom, with attribute title with value 'Barry Poter'

// Retrieve your data here or change XML here:
foreach (XmlNode book in nodeList)
{
  book.InnerText="The story began as it was...";
}

Console.WriteLine("Display XML:");
doc.Save(Console.Out);

the documentation