I'm getting two different hashes of the same XML document: one when I canonicalize the XML directly, and another when I compute a digital signature over it, which is supposed to apply the same canonicalization algorithm to the XML before hashing it. I worked out that the canonicalization done during signing includes the newline characters ('\n') and spacing characters, whereas the direct canonicalization does not.
Including the newline characters and spaces is not in the canonicalization specification, though, is it? I'm specifically looking at this version: http://www.w3.org/TR/2001/REC-xml-c14n-20010315
Does anyone know what is going on? I've included the XML document and both implementations of the code so you can see.
This is really puzzling me and I'd like to know why. Am I missing something obvious?
<root>
<child1>some text</child1>
<child2 attr="1" />
</root>
The direct canonicalization code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Security.Cryptography.Xml;
using System.Security.Cryptography;
using System.IO;
using System.ComponentModel;
namespace XML_SignatureGenerator
{
class XML_C14N
{
private String _filename;
private Boolean isCommented = false;
private XmlDocument xmlDoc = null;
public XML_C14N(String filename)
{
_filename = filename;
xmlDoc = new XmlDocument();
xmlDoc.Load(_filename);
}
//implement this spec http://www.w3.org/TR/2001/REC-xml-c14n-20010315
public String XML_Canonalize(System.Windows.Forms.RichTextBox tb)
{
//create c14n instance and load in xml file
XmlDsigC14NTransform c14n = new XmlDsigC14NTransform(isCommented);
c14n.LoadInput(xmlDoc);
//get canonalised stream
Stream s1 = (Stream)c14n.GetOutput(typeof(Stream));
SHA1 sha1 = new SHA1CryptoServiceProvider();
Byte[] output = sha1.ComputeHash(s1);
tb.Text = Convert.ToBase64String(output);
//rewind the stream (ComputeHash read it to the end), then create new xmldocument and save
s1.Position = 0;
String newFilename = _filename.Substring(0, _filename.Length - 4) + "C14N.xml";
XmlDocument xmldoc2 = new XmlDocument();
xmldoc2.Load(s1);
xmldoc2.Save(newFilename);
return newFilename;
}
public void set_isCommented(Boolean value)
{
isCommented = value;
}
public Boolean get_isCommented()
{
return isCommented;
}
}
}
The xml digital signature code
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Xml;
using System.Security.Cryptography;
using System.Security.Cryptography.Xml;
namespace XML_SignatureGenerator
{
class xmlSignature
{
public xmlSignature(String filename)
{
_filename = filename;
}
public Boolean SignXML()
{
RSACryptoServiceProvider rsa = new RSACryptoServiceProvider();
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.PreserveWhitespace = true;
String fname = _filename; //"C:\\SigTest.xml";
xmlDoc.Load(fname);
SignedXml xmlSig = new SignedXml(xmlDoc);
xmlSig.SigningKey = rsa;
Reference reference = new Reference();
reference.Uri = "";
XmlDsigC14NTransform env = new XmlDsigC14NTransform(false);
reference.AddTransform(env);
xmlSig.AddReference(reference);
xmlSig.ComputeSignature();
XmlElement xmlDigitalSignature = xmlSig.GetXml();
xmlDoc.DocumentElement.AppendChild(xmlDoc.ImportNode(xmlDigitalSignature, true));
xmlDoc.Save(Environment.GetFolderPath(Environment.SpecialFolder.Desktop) + "/SignedXML.xml");
return true;
}
private String _filename;
}
}
Any idea would be great! It's all C# code by the way.
Thanks in advance
Jon
The way in which XML Sig handles whitespace is, in my opinion, broken. It's certainly not compliant with what most right-thinking people would call canonicalization. Changing whitespace should not affect the digest, but in XML Sig, it does.
One possible workaround is to pass the document through a canonicalizer routine before passing it to the signature generation code. That should make things far more predictable.
This article might help clarify things.
It looks like in your second piece of code you have
xmlDoc.PreserveWhitespace = true;
while in the first you do not.
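As I understand it, the canonicalisation specification retains whitespace in character content, which includes the newlines and spaces between elements. With PreserveWhitespace left at its default of false, XmlDocument drops those whitespace-only nodes when it loads the file, before the transform ever sees them, so the two code paths canonicalize different input. I suggest you include this line in both.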
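To see the effect in isolation, here is a minimal sketch that hashes the same XML text twice, changing only PreserveWhitespace. The class name C14NWhitespaceDemo and the helper Digest are made up for this illustration, and the XML from the question is inlined as a string with '\n' line endings rather than loaded from a file. It assumes a reference to System.Security (or the System.Security.Cryptography.Xml package on newer runtimes).

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Security.Cryptography.Xml;
using System.Xml;

static class C14NWhitespaceDemo
{
    // Hypothetical helper: canonicalize the XML with C14N 1.0 (comments excluded)
    // and return the Base64-encoded SHA-1 digest of the canonical bytes.
    public static string Digest(string xml, bool preserveWhitespace)
    {
        // PreserveWhitespace must be set before loading; it controls whether
        // whitespace-only text nodes are kept in the DOM.
        var doc = new XmlDocument { PreserveWhitespace = preserveWhitespace };
        doc.LoadXml(xml);

        var c14n = new XmlDsigC14NTransform(false);
        c14n.LoadInput(doc);

        using (var canonical = (Stream)c14n.GetOutput(typeof(Stream)))
        using (var sha1 = SHA1.Create())
        {
            return Convert.ToBase64String(sha1.ComputeHash(canonical));
        }
    }

    static void Main()
    {
        // The document from the question, assuming '\n' line endings.
        const string xml =
            "<root>\n<child1>some text</child1>\n<child2 attr=\"1\" />\n</root>";

        // Whitespace between elements is dropped on load, so it never reaches the transform.
        Console.WriteLine(Digest(xml, preserveWhitespace: false));

        // Whitespace between elements is kept, as in the signing code.
        Console.WriteLine(Digest(xml, preserveWhitespace: true));
    }
}
```

The two printed digests should differ, and the second should be the one that matches what the signing path hashes for the same input, since that path also loads the document with PreserveWhitespace = true.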