
What is the most efficient way in C# to merge more than 2 xml files with the same schema together?

Tags:

c#

xml

I have several fairly large XML files that represent data exported from a system that is to be used by a 3rd party vendor. I was chopping the results at 2,500 records for each XML file because the files become huge and unmanageable otherwise. However, the 3rd party vendor has asked me to combine all of these XML files into a single file. There are 78 of these XML files and they total over 700MB in size! Crazy, I know... so how would you go about combining these files to accommodate the vendor using C#? Hopefully there is a really efficient way to do this without reading in all of the files at once using LINQ :-)

Rob Packwood asked Sep 10 '09 14:09


2 Answers

I'm going to go out on a limb here and assume that your xml looks something like:

<records>
  <record>
    <dataPoint1/>
    <dataPoint2/>
  </record>
</records>

If that's the case, I would open a file stream and write the opening <records> tag, then open each XML file in turn and copy every line except the ones containing the root tags to the output, finishing with the closing </records> tag. That way you never hold huge strings in memory, and it should be very quick to code and run. Note that this relies on each root tag sitting on its own line.

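using System.Collections.Generic;
using System.IO;

public void ConsolidateFiles(List<string> files, string outputFile)
{
  // The using blocks flush and close each stream, even if an exception is thrown.
  using (var output = new StreamWriter(File.Open(outputFile, FileMode.Create)))
  {
    output.WriteLine("<records>");
    foreach (var file in files)
    {
      using (var input = new StreamReader(File.Open(file, FileMode.Open)))
      {
        string line;
        while ((line = input.ReadLine()) != null)
        {
          // Skip each file's XML declaration (if any) and its root tags;
          // a declaration in the middle of the output would be ill-formed.
          if (!line.Contains("<?xml") &&
              !line.Contains("<records>") &&
              !line.Contains("</records>"))
          {
            // WriteLine keeps the line breaks that ReadLine strips off.
            output.WriteLine(line);
          }
        }
      }
    }
    output.WriteLine("</records>");
  }
}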
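
The line-by-line copy above only works if the root tags sit on lines of their own. If that can't be guaranteed, one possible variation is to stream the files through XmlReader and XmlWriter instead, which still avoids loading a whole file into memory. The following is a minimal sketch, not a tested drop-in: the class name XmlMerger, the method name MergeRecords, and the default root name "records" are all assumptions made for illustration, based on the guessed layout above.

```csharp
using System.Collections.Generic;
using System.Xml;

public static class XmlMerger
{
    // Hypothetical helper: copies every child element of each input file's
    // root element into one output document under a single root element.
    public static void MergeRecords(IEnumerable<string> files, string outputFile,
                                    string rootName = "records")
    {
        var writerSettings = new XmlWriterSettings { Indent = true };
        var readerSettings = new XmlReaderSettings { IgnoreWhitespace = true };

        using (XmlWriter writer = XmlWriter.Create(outputFile, writerSettings))
        {
            writer.WriteStartDocument();
            writer.WriteStartElement(rootName);

            foreach (string file in files)
            {
                using (XmlReader reader = XmlReader.Create(file, readerSettings))
                {
                    reader.MoveToContent();            // position on the root element
                    if (reader.IsEmptyElement)
                    {
                        continue;                      // nothing to copy from an empty root
                    }
                    reader.Read();                     // step inside the root element

                    // Stop at the root's end tag; nested end tags are consumed by WriteNode.
                    while (!reader.EOF && reader.NodeType != XmlNodeType.EndElement)
                    {
                        if (reader.NodeType == XmlNodeType.Element)
                        {
                            // Copies the element with its subtree and advances the reader past it.
                            writer.WriteNode(reader, false);
                        }
                        else
                        {
                            reader.Read();             // skip comments and other non-element nodes
                        }
                    }
                }
            }

            writer.WriteEndElement();
            writer.WriteEndDocument();
        }
    }
}
```

Only one record subtree is in flight at a time, so memory use should stay roughly constant regardless of how many files are merged. It would be called along the lines of XmlMerger.MergeRecords(files, "combined.xml"), where the output file name is again just a placeholder.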
JustLoren answered Oct 04 '22 02:10


Use DataSet.ReadXml(), DataSet.Merge(), and DataSet.WriteXml(). Let the framework do the work for you.
Something like this:

  using System.Collections.Generic;
  using System.Data;
  using System.Xml;

  public void Merge(List<string> xmlFiles, string outputFileName)
  {
     DataSet complete = new DataSet();

     foreach (string xmlFile in xmlFiles)
     {
        // Dispose the reader so each input file's handle is released promptly.
        using (XmlTextReader reader = new XmlTextReader(xmlFile))
        {
           DataSet current = new DataSet();
           current.ReadXml(reader);
           complete.Merge(current);
        }
     }

     complete.WriteXml(outputFileName);
  }

For further description and examples, take a look at this article from Microsoft.

Donut answered Oct 04 '22 03:10