When .NET's BinaryFormatter is used to serialize an object graph, is any type of compression applied?
I ask in the context of whether I should worry about the object graph having many repeated strings and integers.
Edit - Hold on, if strings are interned in .NET, there's no need to worry about repeated strings, right?
No, it doesn't provide any compression, but you can compress the output yourself using the GZipStream type.
Edit: Mehrdad has a wonderful example of this technique in his answer to How to compress a .net object instance using gzip.
Edit 2: Strings can be interned but that doesn't mean that every string is interned. I wouldn't make any assumptions on how or why the CLR decides to intern strings as this can change (and has changed) from version to version.
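To illustrate the point about interning, here is a minimal sketch. The class and method names (InternDemo, Build, SameInstance) are made up for this example. It only shows that two strings built at run time have equal contents but are separate objects unless you intern them explicitly; it does not show what the CLR interns on its own.

```csharp
using System;

public static class InternDemo
{
    // Builds a fresh string at run time, so each call allocates a new object.
    public static string Build()
    {
        return new string(new[] { 'h', 'e', 'l', 'l', 'o' });
    }

    // True only when both variables point at the very same string instance.
    public static bool SameInstance(string x, string y)
    {
        return object.ReferenceEquals(x, y);
    }

    public static void Main()
    {
        string a = Build();
        string b = Build();

        Console.WriteLine(a == b);              // value equality
        Console.WriteLine(SameInstance(a, b));  // reference equality

        // string.Intern returns one shared instance per distinct value.
        Console.WriteLine(SameInstance(string.Intern(a), string.Intern(b)));
    }
}
```

So equal strings in your object graph are not guaranteed to be the same instance; whether they are depends on how they were created.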
No, it does not, but...
I just added GZipStream support to my app today, so I can share some code here:
Serialization:
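// Requires System.IO, System.IO.Compression, System.Security.Cryptography
// and System.Runtime.Serialization.Formatters.Binary.
using (Stream s = File.Create(PathName))
using (RijndaelManaged rm = new RijndaelManaged())
{
    rm.Key = CryptoKey;
    rm.IV = CryptoIV;
    // Outer layer: encrypt everything before it reaches the file.
    using (CryptoStream cs = new CryptoStream(s, rm.CreateEncryptor(), CryptoStreamMode.Write))
    {
        // Inner layer: compress first, so the encryptor receives compressed bytes.
        using (GZipStream gs = new GZipStream(cs, CompressionMode.Compress))
        {
            BinaryFormatter bf = new BinaryFormatter();
            bf.Serialize(gs, _instance);
        }
    }
}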
Deserialization:
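// Same layers as for writing, undone in reverse: decrypt, then decompress.
using (Stream s = File.OpenRead(PathName))
using (RijndaelManaged rm = new RijndaelManaged())
{
    rm.Key = CryptoKey;
    rm.IV = CryptoIV;
    // Decrypt the raw file contents as they are read.
    using (CryptoStream cs = new CryptoStream(s, rm.CreateDecryptor(), CryptoStreamMode.Read))
    {
        // Decompress the decrypted bytes before handing them to the formatter.
        using (GZipStream gs = new GZipStream(cs, CompressionMode.Decompress))
        {
            BinaryFormatter bf = new BinaryFormatter();
            _instance = (Storage)bf.Deserialize(gs);
        }
    }
}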
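NOTE: if you use CryptoStream, it is important to chain (un)zipping and (de)crypting in exactly this order: compress before encrypting when writing, and decrypt before decompressing when reading. Encryption turns your data into what looks like random noise, which no longer compresses, so you want to squeeze out the redundancy BEFORE that happens.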
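A rough way to see why the order matters, as a sketch rather than a benchmark: gzip a highly repetitive buffer and a buffer of random bytes and compare the sizes. The random bytes only stand in for ciphertext (an assumption for illustration; no actual encryption happens here), and the names CompressionOrderDemo and GzipLength are made up for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

public static class CompressionOrderDemo
{
    // Gzips a buffer in memory and returns the compressed length in bytes.
    public static long GzipLength(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // leaveOpen keeps the MemoryStream usable after the GZipStream is closed.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(data, 0, data.Length);
            }
            return output.Length;
        }
    }

    public static void Main()
    {
        // Repetitive input, standing in for a serialized graph with many repeated values.
        byte[] repetitive = new byte[100000];
        for (int i = 0; i < repetitive.Length; i++)
        {
            repetitive[i] = (byte)(i % 16);
        }

        // Random input, standing in for already-encrypted data.
        byte[] noise = new byte[100000];
        using (var rng = RandomNumberGenerator.Create())
        {
            rng.GetBytes(noise);
        }

        Console.WriteLine("repetitive: " + GzipLength(repetitive) + " bytes");
        Console.WriteLine("noise:      " + GzipLength(noise) + " bytes");
    }
}
```

The repetitive buffer should shrink to a small fraction of its size while the random one does not shrink at all, which is what you would get if you put the GZipStream outside the CryptoStream instead of inside it.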