Question

In terms of both parsing (serializing/deserializing) and sending packets over the network, is there any good estimate of the performance differences between binary and XML serialization?

Solution

Nope.

It depends highly on what sort of data is inside the XML document itself. If you have a lot of structured data, the overhead for XML will be large. For example, if your data looks like:

<person>
  <name>Dave</name>
  <ssn>000-00-0000</ssn>
  <email1>xxxxxx</email1>
</person>
...

You'll have a lot more overhead than if you have an XML document that looks like:

<book name="bible">
 In the beginning God created the heavens and the earth. 
 Now the earth was formless and empty ... 
 And if any man shall take away from the words of the book of this prophecy, God shall take away his part out of the book of life, and out of the holy city, and from the things which are written in this book. He which testifieth these things saith, Surely I come quickly. Amen. Even so, come, Lord Jesus.
</book>

So it's not really a fair question. It depends highly on the data YOU intend to send, and how/if you're compressing it.
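To make that overhead concrete, here is a small sketch (in Python, with hypothetical field values) comparing the same structured record serialized as XML versus a minimal length-prefixed binary encoding:

```python
import struct
import xml.etree.ElementTree as ET

# A hypothetical "person" record like the one in the example above.
person = {"name": "Dave", "ssn": "000-00-0000", "email1": "dave@example.com"}

# XML: every field pays for an opening and a closing tag on the wire.
root = ET.Element("person")
for key, value in person.items():
    ET.SubElement(root, key).text = value
xml_bytes = ET.tostring(root)

# Binary: length-prefixed UTF-8 values, no field names on the wire at all.
binary_parts = []
for value in person.values():
    encoded = value.encode("utf-8")
    binary_parts.append(struct.pack("B", len(encoded)) + encoded)
binary_bytes = b"".join(binary_parts)

print(len(xml_bytes), len(binary_bytes))
```

For a record like this, the tag names dominate the XML payload; for a document that is mostly text (like the book example), the relative overhead of the tags shrinks toward nothing.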

OTHER TIPS

The biggest difference between BinaryFormatter and XML serialization is portability: BinaryFormatter output is hard to guarantee between versions, so it is only really suitable for short-term storage or transfer.

However, you can get the best of both, with output that is both smaller and quicker, by using bespoke binary serialization - and you don't even have to do it yourself ;-p

protobuf-net is a .NET implementation of Google's protocol buffers binary serialization spec; it is smaller than either XmlSerializer or BinaryFormatter, fully portable (not just between versions - you can load a pb stream in, for example, Java), extensible, and fast. It is also pretty comprehensively tested, with a fair number of users.
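As a rough sketch of what a protocol buffers contract looks like (a hypothetical schema for the person record above, not taken from the protobuf-net docs), you declare the fields once with numeric tags, and only those tag numbers travel on the wire:

```proto
// Hypothetical schema for the person record shown earlier.
message Person {
  required string name = 1;
  required string ssn = 2;
  required string email1 = 3;
}
```

Because the field names never leave the schema, the encoded stream stays compact even for highly structured data.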

A full breakdown of size and speed, covering XmlSerializer, BinaryFormatter, DataContractSerializer and protobuf-net is here.

Instinctively you would want to say that binary is more efficient, but it actually depends on the data being serialized.

Check out this article: http://www.nablasoft.com/alkampfer/index.php/2008/10/31/binary-versus-xml-serialization-size/

Just pointing out that performance is not the only metric you may want to look at.

  • Ease of construction. Do you have several days/weeks to build and thoroughly test a serialiser/deserialiser routine, or could that time be better spent on features?
  • Ease of consuming the data. Can a client use a pre-built open-source parser or do they need to implement a bunch of (potentially buggy) code themselves?
  • Ease of debugging. Will being able to view the data in transit help with debugging? A binary format will tend to obscure any issues.
  • What is the maintenance cost for each method?

Personally, I would use a published XML standard and open source parsing libraries until a performance bottleneck is proven by actual testing.
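When it comes to proving that bottleneck, the test can be very small. A minimal sketch (Python standard library only, with a hypothetical payload of person records) that times XML parsing against a binary deserializer:

```python
import pickle
import timeit
import xml.etree.ElementTree as ET

# Hypothetical payload: a list of person records.
people = [{"name": "Dave", "ssn": "000-00-0000"} for _ in range(1000)]

# Serialize once up front so only deserialization is timed.
xml_doc = "<people>" + "".join(
    "<person><name>%s</name><ssn>%s</ssn></person>" % (p["name"], p["ssn"])
    for p in people
) + "</people>"
pickled = pickle.dumps(people)

xml_time = timeit.timeit(lambda: ET.fromstring(xml_doc), number=100)
bin_time = timeit.timeit(lambda: pickle.loads(pickled), number=100)

print("xml: %.4fs  binary: %.4fs" % (xml_time, bin_time))
```

Until numbers like these show the XML path is actually the bottleneck for your data, the portability and debuggability arguments above usually win.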

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow