A week ago I ran into a situation where I had to read a binary serialized object produced by someone else's application. I only had the someSerializedData.bin file, so I tried to recreate the class definition for the unknown object by hand, and I managed to do so thanks to the metadata in the serialized file. Oddly, I couldn't find any tool for this on Google.
Q1: Why is there no tool that recreates the class definition from a binary serialized file/data?
That leads to my second question:
Q2: Is there any case where it's impossible to restore the class definition from the serialized data? (Assume it is not encrypted or obfuscated in any way. I'm interested in cases involving the "default" .NET binary serializer properties, including any that disable type information and metadata.)
It is impossible to deserialize binary data without knowing what's in it. The only way around this is to serialize it in a self-describing format such as JSON or XML. An example to illustrate:
Your name, "Casual", can be serialized as 67, 97, 115, 117, 97, 108. In case you didn't notice, this uses ASCII encoding (if I didn't make any mistakes). Now imagine you don't know it was done with ASCII: who says this is not just an array of numbers? Or 3 arrays of 2 numbers? Or an object with ID 67 and an object with ID 117? Nobody knows, so your task is impossible.
The only option is to get in touch with the person who serialized it originally and ask them how it was done and which objects are stored in this binary data.
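To make the ambiguity concrete, here is a minimal Python sketch (Python is just a convenient choice here, the point is language-independent). It takes the six bytes from the example and reads them in several equally plausible ways. The variable names and the little-endian 16-bit reading are my own illustrative assumptions, not anything dictated by a particular serializer.

```python
import struct

# The six bytes from the example above.
raw = bytes([67, 97, 115, 117, 97, 108])

# Interpretation 1: ASCII text.
as_text = raw.decode("ascii")

# Interpretation 2: a flat array of small numbers.
as_numbers = list(raw)

# Interpretation 3: three arrays of two numbers each.
as_pairs = [list(raw[i:i + 2]) for i in range(0, len(raw), 2)]

# Interpretation 4 (assumed for illustration): three little-endian
# unsigned 16-bit integers.
as_uint16 = struct.unpack("<3H", raw)

print(as_text, as_numbers, as_pairs, as_uint16)
```

Without metadata in the stream, nothing in the bytes themselves says which of these readings was intended.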
Kind regards
Q1: Why is there no tool that recreates the class definition from a binary serialized file/data?
My guess is that very few people need this. To start with, binary serialization isn't as popular as XML, JSON and other formats that are standardized and supported virtually everywhere.
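The binary format is also not widely known. Microsoft describes it in the [MS-NRBF] specification (.NET Remoting: Binary Format Data Structure), but in practice one still needs to dig into the .NET Framework sources to understand the details. It's not fun.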
Q2: Is there such case when it's impossible to restore the class definition from the serialized data?
It looks like the binary format contains enough data. If you absolutely need a tool to reverse engineer the original classes and their fields from the serialized files, you can start by reading the sources of System.Runtime.Serialization.Formatters.Binary.BinaryFormatter, System.Runtime.Serialization.Formatters.Binary.ObjectReader and other classes from mscorlib.
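As a rough starting point, here is a Python sketch of what such a tool's first step might look like. It is based on my reading of the [MS-NRBF] record layout and handles only the simplest case: a header record, an optional BinaryLibrary record, and one ClassWithMembersAndTypes record. The function names (`read_7bit_length`, `read_string`, `sketch_class_info`) are hypothetical, and the record layout assumed in the comments should be checked against the specification before relying on it.

```python
import io
import struct


def read_7bit_length(stream):
    """Read a variable-length integer: 7 bits per byte, high bit means 'more'."""
    result, shift = 0, 0
    while True:
        byte = stream.read(1)[0]
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7


def read_string(stream):
    """Read a length-prefixed UTF-8 string."""
    length = read_7bit_length(stream)
    return stream.read(length).decode("utf-8")


def read_int32(stream):
    """Read a little-endian signed 32-bit integer."""
    return struct.unpack("<i", stream.read(4))[0]


def sketch_class_info(data):
    """Pull the class name and member names out of the first class record."""
    stream = io.BytesIO(data)
    # Assumed header layout: record type 0, then RootId, HeaderId,
    # MajorVersion, MinorVersion as little-endian int32 values.
    if stream.read(1) != b"\x00":
        raise ValueError("stream does not start with a serialization header")
    _root_id, _header_id, major, minor = struct.unpack("<4i", stream.read(16))
    info = {"version": (major, minor), "library": None,
            "class": None, "members": []}

    record_type = stream.read(1)
    if record_type == b"\x0c":  # assumed: BinaryLibrary (id + assembly name)
        read_int32(stream)      # library id, not needed here
        info["library"] = read_string(stream)
        record_type = stream.read(1)

    if record_type == b"\x05":  # assumed: ClassWithMembersAndTypes
        read_int32(stream)      # object id, not needed here
        info["class"] = read_string(stream)
        member_count = read_int32(stream)
        info["members"] = [read_string(stream) for _ in range(member_count)]
    return info
```

Calling `sketch_class_info(open("someSerializedData.bin", "rb").read())` would be the obvious way to try it. A real tool would also have to decode the member type information that follows the member names, plus the other record kinds, which this sketch ignores.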
However, if the application which produced the files isn't obfuscated, I suggest trying to decompile it first. It will likely be much easier.
P.S. Don't forget to consult your lawyer.