Question

A week ago I ran into a situation where I had to read a binary-serialized object produced by someone else's application. I only had the someSerializedData.bin file, so I tried to recreate the class definition for the unknown object by hand, and I succeeded thanks to the metadata in the serialized file. Oddly, I couldn't find any tool for this on Google.

Q1: Why is there no tool that recreates the class definition from a binary serialized file/data?

This leads to my second question:

Q2: Is there any case in which it's impossible to restore the class definition from the serialized data? (Assume the data is not encrypted or obfuscated in any way. I'm interested in cases involving the "default" .NET binary serializer properties, such as ones that disable the included type information and metadata.)

Solution 3

No tool exists because a type that only contains the data is often not enough. The methods are often just as important as the data, especially when properties do more than set their private fields, and nobody can recover those methods from the serialized data.

With that said, it may be useful to have a tool that is at least able to generate a type to hold the data. Maybe you'll be the first one to create such a tool?
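
As a rough illustration of what such a data-only generator might look like, the sketch below renders a C# class skeleton from a list of member names and types. It is written in Python. The function `render_data_class` and the sample `Person` members are hypothetical, and the sketch assumes the names and types have already been recovered from the file.

```python
def render_data_class(class_name, members):
    """Render a data-only C# class from (member_name, csharp_type) pairs."""
    lines = ["[Serializable]", f"public class {class_name}", "{"]
    for name, csharp_type in members:
        # Only fields are emitted; the original methods cannot be recovered.
        lines.append(f"    public {csharp_type} {name};")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical members, as they might be recovered from the file's metadata.
source = render_data_class("Person", [("Name", "string"), ("Age", "int")])
print(source)
```

The output holds the data only, which is exactly the limitation described above: any logic in the original methods and property setters is missing.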

OTHER TIPS

It is impossible to deserialize binary data without knowing what's in it. The only way around that is to serialize with a self-describing format such as JSON or XML. An example to illustrate:

Your name "Casual" can be serialized like this: 67, 97, 115, 117, 97, 108. In case you didn't notice, this uses the ASCII encoding (if I didn't make any mistakes). Now imagine you don't know it is ASCII: who says this is not just an array of numbers? Or 3 arrays of 2 numbers? Or an object with ID 67 and an object with ID 117? Nobody knows, so your task is impossible.

The only option is to contact the person who originally serialized it and ask how it was done and which objects the binary data contains.
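
The ambiguity in this example can be shown in a few lines of Python: the same six bytes decode equally well as text, as a flat list of numbers, as three pairs, or as three 16-bit integers. This is only a sketch of raw bytes with no accompanying metadata, and the variable names are illustrative.

```python
import struct

raw = bytes([67, 97, 115, 117, 97, 108])

as_text = raw.decode("ascii")           # one ASCII string
as_numbers = list(raw)                  # six small integers
as_pairs = [tuple(raw[i:i + 2]) for i in range(0, len(raw), 2)]  # three pairs
as_uint16 = struct.unpack("<3H", raw)   # three little-endian 16-bit integers

print(as_text, as_numbers, as_pairs, as_uint16)
```

Nothing in the bytes themselves says which reading is the intended one.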

Kind regards

Q1: Why is there no tool that recreates the class definition from a binary serialized file/data?

My guess is that very few people need this. To start with, binary serialization isn't as popular as XML, JSON and other standardized formats that are supported virtually everywhere.

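The binary format is not widely documented. Microsoft describes it in the [MS-NRBF] specification (.NET Remoting: Binary Format Data Structure); otherwise one needs to dig into the .NET Framework sources to understand it. It's not fun.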

Q2: Is there such case when it's impossible to restore the class definition from the serialized data?

It looks like the binary format contains enough data. If you absolutely need a tool to reverse engineer the original classes and their fields from the serialized files, you can start by reading the sources of System.Runtime.Serialization.Formatters.Binary.BinaryFormatter, System.Runtime.Serialization.Formatters.Binary.ObjectReader and other classes from mscorlib.
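
To give a feel for what the embedded metadata looks like, here is a minimal Python sketch that reads the first class record of such a stream and lists its member names and types. The record layout follows my reading of Microsoft's [MS-NRBF] specification and should be checked against it before relying on it. The sketch only handles string and a few primitive members, and the sample bytes are built by hand for illustration (a hypothetical `MyApp.Person` class), not taken from a real file.

```python
import io
import struct

# Subset of the primitive type codes, mapped to C# keywords.
PRIMITIVE_NAMES = {1: "bool", 2: "byte", 3: "char", 6: "double",
                   7: "short", 8: "int", 9: "long", 11: "float"}


def read_7bit_length(stream):
    """Read the 7-bit encoded integer that prefixes every string."""
    result, shift = 0, 0
    while True:
        byte = stream.read(1)[0]
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result
        shift += 7


def read_string(stream):
    return stream.read(read_7bit_length(stream)).decode("utf-8")


def read_int32(stream):
    return struct.unpack("<i", stream.read(4))[0]


def read_class_definition(stream):
    """Return (class_name, [(member_name, type_name), ...])."""
    # Stream header: record type 0, then root id, header id, major, minor.
    if stream.read(1) != b"\x00":
        raise ValueError("missing serialization stream header")
    stream.read(16)
    record_type = stream.read(1)[0]
    if record_type == 12:  # BinaryLibrary: library id + assembly name
        read_int32(stream)
        read_string(stream)
        record_type = stream.read(1)[0]
    if record_type != 5:  # ClassWithMembersAndTypes
        raise ValueError(f"unsupported record type {record_type}")
    read_int32(stream)  # object id
    class_name = read_string(stream)
    member_count = read_int32(stream)
    names = [read_string(stream) for _ in range(member_count)]
    kinds = list(stream.read(member_count))  # one type-kind byte per member
    types = []
    for kind in kinds:
        if kind == 0:  # primitive: one extra byte identifies the type
            code = stream.read(1)[0]
            types.append(PRIMITIVE_NAMES.get(code, f"primitive#{code}"))
        elif kind == 1:
            types.append("string")
        else:
            raise ValueError(f"member kind {kind} not handled in this sketch")
    return class_name, list(zip(names, types))


def lp(text):
    """Length-prefixed string; valid here only for strings under 128 bytes."""
    data = text.encode("utf-8")
    return bytes([len(data)]) + data


# Hand-built sample stream (an assumption for illustration).
sample = (b"\x00" + struct.pack("<4i", 1, -1, 1, 0)
          + b"\x0c" + struct.pack("<i", 2) + lp("MyApp")
          + b"\x05" + struct.pack("<i", 1) + lp("MyApp.Person")
          + struct.pack("<i", 2) + lp("Name") + lp("Age")
          + bytes([1, 0]) + bytes([8]) + struct.pack("<i", 2))

class_name, members = read_class_definition(io.BytesIO(sample))
print(class_name, members)
```

A real tool would also have to handle nested classes, arrays, object references and the remaining record types, which is where most of the work lies.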

However, if the application which produced the files isn't obfuscated, I suggest trying to decompile it first. It will likely be much easier.

P.S. Don't forget to consult your lawyer.

I am not sure there's enough information in the metadata to re-create the type. Imagine complex (for example, nested) object graphs. In your previous question, member types (String vs. int) were already an issue.

Regarding your second question, I am not sure what you are trying to achieve. I am not sure whether BinaryFormatter can output data in a way that is hard to reverse engineer, but other methods should be simple to implement.

Licensed under: CC-BY-SA with attribution