I have the following C# code which is supposed to serialize arbitrary objects to a string, and then of course deserialize it.
public static string Pack(Message _message)
{
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream original = new MemoryStream();
MemoryStream outputStream = new MemoryStream();
formatter.Serialize(original, _message);
original.Seek(0, SeekOrigin.Begin);
DeflateStream deflateStream = new DeflateStream(outputStream, CompressionMode.Compress);
original.CopyTo(deflateStream);
byte[] bytearray = outputStream.ToArray();
UTF8Encoding encoder = new UTF8Encoding();
string packed = encoder.GetString(bytearray);
return packed;
}
public static Message Unpack(string _packed_message)
{
UTF8Encoding encoder = new UTF8Encoding();
byte[] bytearray = encoder.GetBytes(_packed_message);
BinaryFormatter formatter = new BinaryFormatter();
MemoryStream input = new MemoryStream(bytearray);
MemoryStream decompressed = new MemoryStream();
DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress);
deflateStream.CopyTo(decompressed); // EXCEPTION
decompressed.Seek(0, SeekOrigin.Begin);
var message = (Message)formatter.Deserialize(decompressed); // EXCEPTION 2
return message;
}
But the problem is that any time the code is ran, I am experiencing an exception. Using the above code and invoking it as shown below, I am receiving InvalidDataException: Unknown block type. Stream might be corrupted. at the marked // EXCEPTION
line.
After searching for this issue I have attempted to ditch the deflation. This was only a small change: in Pack
, bytearray
gets created from original.ToArray()
and in Unpack
, I Seek()
input
instead of decompressed
and use Deserialize(input)
instead of decompressed
too. The only result which changed: the exception position and body is different, yet it still happens. I receive a SerializationException: No map for object '201326592'. at // EXCEPTION 2
.
I don't seem to see what is the problem. Maybe it is the whole serialization idea... the problem is that somehow managing to pack the Message
instances is necessary because these objects hold the information that travel between the server and the client application. (Serialization logic is in a .Shared DLL project which is referenced on both ends, however, right now, I'm only developing the server-side first.) It also has to be told, that I am only using string
outputs because right now, the TCP connection between the servers and clients are based on string read-write on the ends. So somehow it has to be brought down to the level of strings.
This is how the Message object looks like:
[Serializable]
public class Message
{
public MessageType type;
public Client from;
public Client to;
public string content;
}
(Client
right now is an empty class only having the Serializable
attribute, no properties or methods.)
This is how the pack-unpack gets invoked (from Main()
...):
Shared.Message msg = Shared.MessageFactory.Build(Shared.MessageType.DEFAULT, new Shared.Client(), new Shared.Client(), "foobar");
string message1 = Shared.MessageFactory.Pack(msg);
Console.WriteLine(message1);
Shared.Message mess2 = Shared.MessageFactory.Unpack(message1); // Step into... here be exceptions
Console.Write(mess2.content);
Here is an image showing what happens in the IDE. The output in the console window is the value of message1
.
Some investigation unfortunately also revealed that the problem could lie around the bytearray
variable. When running Pack()
, after the encoder creates the string, the array contains 152 values, however, after it gets decoded in Unpack()
, the array has 160 values instead.
I am appreciating any help as I am really out of ideas and having this problem the progress is crippled. Thank you.
I would like to thank everyone answering and commenting, as I have reached the solution. Thank you.
Marc Gravell was right, I missed the closing of deflateStream
and because of this, the result was either empty or corrupted. I have taken my time and rethought and rewrote the methods and now it works flawlessly. And even the purpose of sending these bytes over the networked stream is working too.
Also, as Eric J. suggested, I have switched to using ASCIIEnconding
for the change between string
and byte[]
when the data is flowing in the Stream
.
The fixed code lies below:
public static string Pack(Message _message)
{
using (MemoryStream input = new MemoryStream())
{
BinaryFormatter bformatter = new BinaryFormatter();
bformatter.Serialize(input, _message);
input.Seek(0, SeekOrigin.Begin);
using (MemoryStream output = new MemoryStream())
using (DeflateStream deflateStream = new DeflateStream(output, CompressionMode.Compress))
{
input.CopyTo(deflateStream);
deflateStream.Close();
return Convert.ToBase64String(output.ToArray());
}
}
}
public static Message Unpack(string _packed)
{
using (MemoryStream input = new MemoryStream(Convert.FromBase64String(_packed)))
using (DeflateStream deflateStream = new DeflateStream(input, CompressionMode.Decompress))
using (MemoryStream output = new MemoryStream())
{
deflateStream.CopyTo(output);
deflateStream.Close();
output.Seek(0, SeekOrigin.Begin);
BinaryFormatter bformatter = new BinaryFormatter();
Message message = (Message)bformatter.Deserialize(output);
return message;
}
}
Now everything happens just right, as the screenshot proves below. This was the expected output from the first place. The Server and Client executables communicate with each other and the message travels... and it gets serialized and unserialized properly.
In addition to the existing observations about Encoding vs base-64, note you haven't closed the deflate stream. This is important because compression-streams buffer: if you don't close, it may not write the end. For a short stream, that may mean it writes nothing at all.
using(DeflateStream deflateStream = new DeflateStream(
outputStream, CompressionMode.Compress))
{
original.CopyTo(deflateStream);
}
return Convert.ToBase64String(outputStream.GetBuffer(), 0,
(int)outputStream.Length);
Your problem is most probably in the UTF8 encoding. Your bytes are not really a character string and UTF-8 is a encoding with different byte lengths for characters. This means the byte array may not correspond to a correctly encoded UTF-8 string (there may be some bytes missing at the end for instance.)
Try using UTF16 or ASCII which are constant length encodings (the resulting string will likely contain control characters so it won't be printable or transmitable through something like HTTP or email.)
But if you want to encode as a string it is customary to use UUEncoding to convert the byte array into a real printable string, then you can use any encoding you want.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With