Fastest way to add new node to end of an xml?

Tags: c#, .net, xml

I have a large XML file (approx. 10 MB) with the following simple structure:

<Errors>
   <Error>.......</Error>
   <Error>.......</Error>
   <Error>.......</Error>
   <Error>.......</Error>
   <Error>.......</Error>
</Errors>

I need to add a new <Error> node at the end, just before the closing </Errors> tag. What is the fastest way to achieve this in .NET?

Ramesh Soni asked May 11 '09 17:05


2 Answers

You need to use the XML inclusion technique.

Your error.xml is just a stub that never changes; it is the file XML parsers read:

<?xml version="1.0"?>
<!DOCTYPE logfile [
<!ENTITY logrows    
 SYSTEM "errorrows.txt">
]>
<Errors>
&logrows;
</Errors>

Your errorrows.txt file is the one that changes. It has no single root element, so an XML parser cannot read it on its own; it is only read through the stub:

<Error>....</Error>
<Error>....</Error>
<Error>....</Error>

Then, to add an entry to errorrows.txt:

// Requires: using System.IO; using System.Xml;
using (StreamWriter sw = File.AppendText("errorrows.txt"))
using (XmlTextWriter xtw = new XmlTextWriter(sw))
{
    xtw.WriteStartElement("Error");
    // ... write error message here
    xtw.WriteEndElement();
}

Or you can use XElement (.NET 3.5 and later) and append its text to the StreamWriter:

// Requires: using System.IO; using System.Xml.Linq;
using (StreamWriter sw = File.AppendText("errorrows.txt"))
{
    XElement element = new XElement("Error");
    // ... write error message here
    sw.WriteLine(element.ToString());
}

See also Microsoft's article Efficient Techniques for Modifying Large XML Files
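
For completeness, here is a minimal sketch of reading the stub back in .NET. It assumes .NET 4.0 or later, where DTD processing is prohibited by default and has to be switched on before the entity is expanded. The file names follow the example above; the class and method names (ErrorLogReader, CountErrors) are hypothetical.

```csharp
using System;
using System.IO;
using System.Xml;

static class ErrorLogReader
{
    // Hypothetical helper: counts <Error> elements by reading the stub,
    // which pulls in errorrows.txt through the external entity.
    public static int CountErrors(string stubPath)
    {
        var settings = new XmlReaderSettings
        {
            // DTDs are prohibited by default; the entity declaration needs Parse.
            DtdProcessing = DtdProcessing.Parse,
            // Resolver used to load the external entity relative to the stub's location.
            XmlResolver = new XmlUrlResolver()
        };

        int count = 0;
        using (XmlReader reader = XmlReader.Create(stubPath, settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Error")
                    count++;
            }
        }
        return count;
    }
}
```

Enabling DTD processing and an external resolver is only reasonable for files you control, such as your own log. For very large logs, limits such as XmlReaderSettings.MaxCharactersFromEntities may also need adjusting, depending on the runtime.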

tofi9 answered Nov 10 '22 10:11


First, I would rule out System.Xml.XmlDocument: it is a DOM, so the entire file has to be parsed and the whole tree built in memory before anything can be appended. Your 10 MB of text will take up more than 10 MB in memory, which makes this approach both memory-intensive and time-consuming.

Second, I would rule out System.Xml.XmlReader: it has to parse the entire file before it reaches the point where you can append. Since an XmlReader is read-only, you would have to copy everything it reads into an XmlWriter, which means duplicating the whole document just to add one node.

A faster alternative to XmlDocument and XmlReader is string manipulation (which has its own memory issues):

string xml = @"<Errors><Error />...<Error /></Errors>";
int idx = xml.LastIndexOf("</Errors>");

// Element names are case-sensitive in XML, so match the existing <Error> nodes.
xml = xml.Substring(0, idx) + "<Error>new error</Error></Errors>";

Chop off the end tag, add in the new error, and add the end tag back.

I suppose you could go crazy with this: seek back 9 characters from the end of the file (the length of "</Errors>") and overwrite from there. You wouldn't have to read the file in, and the OS could optimize page loading (it would only have to load the last block or so).

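// Assumes a UTF-8/ASCII file that ends with exactly "</Errors>" (no trailing newline).
byte[] tail = System.Text.Encoding.UTF8.GetBytes("<Error>new error</Error></Errors>");
using (System.IO.FileStream fs = System.IO.File.Open("log.xml", System.IO.FileMode.Open, System.IO.FileAccess.ReadWrite))
{
    // FileStream.Write takes a byte array, not a string.
    fs.Seek(-"</Errors>".Length, System.IO.SeekOrigin.End);
    fs.Write(tail, 0, tail.Length);
}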

That will hit a problem if your file is empty or does not end with exactly "</Errors>" (for example, a self-closing "<Errors />" or a trailing newline). These cases can easily be handled by checking the length and the last few bytes first.
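
The length check mentioned above could be folded into a small helper. This is a sketch under stated assumptions, not a definitive implementation: the names ErrorLogAppender and AppendError are hypothetical, the file is assumed to be UTF-8 or ASCII, and the closing tag is assumed to be written exactly as </Errors>, optionally followed by whitespace.

```csharp
using System;
using System.IO;
using System.Text;

static class ErrorLogAppender
{
    const string ClosingTag = "</Errors>";

    // Hypothetical helper: overwrites the trailing </Errors> with the new
    // element followed by a fresh closing tag. Returns false (and leaves the
    // file untouched) when the file does not end with the closing tag.
    public static bool AppendError(string path, string errorXml)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.ReadWrite))
        {
            // Read only the last few bytes; the rest of the file is never loaded.
            int tailLength = (int)Math.Min(fs.Length, 64);
            byte[] tail = new byte[tailLength];
            fs.Seek(-tailLength, SeekOrigin.End);
            int read = 0;
            while (read < tailLength)
            {
                int n = fs.Read(tail, read, tailLength - read);
                if (n == 0) break;
                read += n;
            }

            // Skip trailing whitespace such as a final newline.
            int end = read;
            while (end > 0 && (tail[end - 1] == ' ' || tail[end - 1] == '\t' ||
                               tail[end - 1] == '\r' || tail[end - 1] == '\n'))
            {
                end--;
            }

            // Compare bytes directly so multi-byte characters earlier in the
            // tail cannot shift the offsets.
            byte[] tag = Encoding.ASCII.GetBytes(ClosingTag);
            int start = end - tag.Length;
            if (start < 0) return false;
            for (int i = 0; i < tag.Length; i++)
            {
                if (tail[start + i] != tag[i]) return false;
            }

            // Overwrite from the closing tag onward, then fix up the file length.
            long tagOffset = fs.Length - read + start;
            byte[] payload = Encoding.UTF8.GetBytes(errorXml + ClosingTag);
            fs.Seek(tagOffset, SeekOrigin.Begin);
            fs.Write(payload, 0, payload.Length);
            fs.SetLength(tagOffset + payload.Length);
            return true;
        }
    }
}
```

A call such as ErrorLogAppender.AppendError("log.xml", "<Error>new error</Error>") touches only the tail of the file. The caller is responsible for passing a well-formed, properly escaped fragment, since nothing here validates it.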

Colin Burnett answered Nov 10 '22 11:11