
Compare XML fragments & return differences

Tags: c#, compare, xml

I have an audit list full of serialized objects, and I'd like to compare them and return a list of the differences. By 'compare' I mean I want to return where the text of an element has changed, or where a node has been added (so it's not in Xml1 but it is in Xml2; it won't happen the other way around).

Sample xml:

<HotelBookingView xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Id>119</Id>
  <RoomId>1</RoomId>
  <ChangeRequested>false</ChangeRequested>
  <CourseBookings>
    <CourseHotelLink>
      <Id>0</Id>
    </CourseHotelLink>
  </CourseBookings>
</HotelBookingView>

The namespaces and the names/case of the tags will not change. All that can change in this sample is the values between the tags and the number of 'CourseHotelLink' elements (it's a serialized list).

The final result I would like is a list of the nodes that have changed, each with its old value and its new value.

What is the best option to compare them? I am using .NET 4.0, so LINQ is an option. I need to be able to do the comparison without necessarily knowing the names of all the nodes, though I will only ever compare two objects of the same type. I have been trying to use the following code, but I can't manage to adapt it to pick out changes in text as well as extra nodes.

XmlDocument Xml1 = new XmlDocument();
XmlDocument Xml2 = new XmlDocument();
Xml1.LoadXml(list[1].Changes);
Xml2.LoadXml(list[2].Changes);
foreach (XmlNode chNode in Xml2.ChildNodes)
{
   CompareLower(chNode);
}

protected void CompareLower(XmlNode aNode)
{
    foreach (XmlNode chlNode in aNode.ChildNodes)
    {
        string Path = CreatePath(chlNode);
        if (chlNode.Name == "#text")
        {
            //all my efforts at comparing text have failed
            continue;
        }
        if (Xml1.SelectNodes(Path).Count == 0)
        {
            XmlNode TempNode = Xml1.ImportNode(chlNode, true);
            //node didn't used to exist, this works- though doesn't return values
            str = str + "New Node: " + TempNode.Name + ": " + TempNode.Value;
        }
        else
        {
            CompareLower(chlNode);
        }
    } 
}

It's likely my code attempts are miles off and there is a much better way to do this; any suggestions welcome!
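For reference, the LINQ option mentioned in the question could look roughly like the sketch below. It is only an illustration, not the approach the question or answer ends up using. `XmlCompare` and `Diff` are hypothetical names, and the sketch assumes that sibling order never changes, that new elements are only ever appended at the end of a list, and that attributes and mixed content can be ignored. The bracketed index in each path is the element's position among all of its siblings, not an XPath per-name index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: walks two trees of the same shape in parallel.
public static class XmlCompare
{
    // oldEl / newEl: the same element in the old and new document.
    // path: a readable location string for that element, e.g. "/HotelBookingView".
    public static IEnumerable<string> Diff(XElement oldEl, XElement newEl, string path)
    {
        var oldKids = oldEl.Elements().ToList();
        var newKids = newEl.Elements().ToList();

        // Leaf on both sides: compare the text content.
        if (oldKids.Count == 0 && newKids.Count == 0)
        {
            if (oldEl.Value != newEl.Value)
                yield return string.Format("Changed {0}: '{1}' -> '{2}'",
                                           path, oldEl.Value, newEl.Value);
            yield break;
        }

        // Pair children by position (assumption: order is stable).
        for (int i = 0; i < newKids.Count; i++)
        {
            string childPath = path + "/" + newKids[i].Name.LocalName + "[" + (i + 1) + "]";

            if (i >= oldKids.Count)
            {
                // No counterpart in the old document: report as added.
                yield return string.Format("Added {0}: '{1}'", childPath, newKids[i].Value);
                continue;
            }

            foreach (var difference in Diff(oldKids[i], newKids[i], childPath))
                yield return difference;
        }
    }
}
```

It would be called as `XmlCompare.Diff(XElement.Parse(oldXml), XElement.Parse(newXml), "/HotelBookingView")`, where `oldXml` and `newXml` stand in for the two serialized strings.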

Edited to add: I ended up using the MS XML Diff tool. The following code produces a big HTML table listing the two XML nodes, with the differences highlighted in green. So it's possible (though insane) to produce the HTML, search it for the text 'lightgreen' (the highlight colour), and then do some string manipulation to display only the changed child node.

var node1 = XElement.Parse("Xml string 1 here").CreateReader();
var node2 = XElement.Parse("Xml string 2 here").CreateReader();

MemoryStream diffgram = new MemoryStream();
XmlTextWriter diffgramWriter = new XmlTextWriter(new StreamWriter(diffgram));

XmlDiff xmlDiff = new XmlDiff(XmlDiffOptions.IgnoreChildOrder);
xmlDiff.Algorithm = XmlDiffAlgorithm.Fast;
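xmlDiff.Compare(node1, node2, diffgramWriter);
diffgramWriter.Flush(); // make sure the buffered diffgram reaches the MemoryStream before rewinding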

diffgram.Seek(0, SeekOrigin.Begin);
XmlDiffView xmlDiffView = new Microsoft.XmlDiffPatch.XmlDiffView();
StringBuilder sb = new StringBuilder();
TextWriter resultHtml = new StringWriter(sb);
xmlDiffView.Load("Xml string 1", new XmlTextReader(diffgram)); 

xmlDiffView.GetHtml(resultHtml);
resultHtml.Close();
UglyTeapot asked May 07 '12 12:05


1 Answer

Using XmlDiff is the way to go. To prove it, here's some working code. I'm using your XML; if your XML is different (or invalid), this may not work.

Original:

var xml1 = @"<HotelBookingView xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"" xmlns:xsd=""http://www.w3.org/2001/XMLSchema"">
<Id>119</Id>
<RoomId>1</RoomId>
<ChangeRequested>false</ChangeRequested>
<CourseBookings>      
    <CourseHotelLink>
    <Id>0</Id>
    </CourseHotelLink>
</CourseBookings>
</HotelBookingView>";

Modified (different Id value in the CourseHotelLink inside CourseBookings):

var xml2 = @"<HotelBookingView xmlns:xsi=""http://www.w3.org/2001/XMLSchema-instance"" xmlns:xsd=""http://www.w3.org/2001/XMLSchema"">
<Id>119</Id>
<RoomId>1</RoomId>
<ChangeRequested>false</ChangeRequested>
<CourseBookings>      
    <CourseHotelLink>
    <Id>1</Id>
    </CourseHotelLink>
</CourseBookings>
</HotelBookingView>";

Low effort way of creating readers (change to XDocument if needed):

var node1 = XElement.Parse(xml1).CreateReader();
var node2 = XElement.Parse(xml2).CreateReader();

Prepare the result writer:

var result = new XDocument();
var writer = result.CreateWriter();

Do the diff:

var diff = new Microsoft.XmlDiffPatch.XmlDiff();    
diff.Compare(node1, node2, writer);
writer.Flush(); 
writer.Close();

result is now an XDocument that contains a summary of the differences:

<xd:xmldiff version="1.0" srcDocHash="14506386314386767543" options="None" fragments="no" xmlns:xd="http://schemas.microsoft.com/xmltools/2002/xmldiff">
  <xd:node match="1">
    <xd:node match="4">
      <xd:node match="1">
        <xd:node match="1">
          <xd:change match="1">1</xd:change>
        </xd:node>
      </xd:node>
    </xd:node>
  </xd:node>
</xd:xmldiff>
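To turn a diffgram like this into a flat list, one option is to read it back with LINQ to XML. The sketch below is an illustration only: `DiffgramReader` and `Changes` are hypothetical names, it handles only `xd:change` elements, and it assumes that each `match` attribute is a position step into the original document, so joining the steps gives a location such as 1/4/1/1/1. It reports only the new value; the old value would still have to be looked up in the original document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: lists the xd:change entries of an XmlDiff diffgram.
public static class DiffgramReader
{
    // Namespace taken from the diffgram output shown above.
    static readonly XNamespace Xd = "http://schemas.microsoft.com/xmltools/2002/xmldiff";

    public static IEnumerable<string> Changes(XDocument diffgram)
    {
        foreach (var change in diffgram.Descendants(Xd + "change"))
        {
            // Collect the match attributes from the outermost xd:node down to the change itself.
            var steps = change.AncestorsAndSelf()
                              .Where(e => e.Attribute("match") != null)
                              .Reverse()
                              .Select(e => (string)e.Attribute("match"));

            // Element text is the new value recorded by the diffgram.
            yield return string.Join("/", steps) + " => " + change.Value;
        }
    }
}
```

With the `result` document from the code above, `DiffgramReader.Changes(result)` could then be enumerated to get one line per changed value.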
yamen answered Oct 05 '22 12:10