Question

Question: How common is it for a developer to create their own serialization format? Specifically, I'm using Java to essentially send the object as a giant string, with tokens to delimit the variables.

My logic: I chose this because it largely eliminates language dependency (ignoring Java's modified UTF-8). It also avoids the object version problem: with Java's built-in serialization, the receiving end has to have the exact same version of the object, so a client running an older version wouldn't be able to read any of the object's data. The code isn't too ugly and it reads OK, but I guess my question is: what are the best practices in this case? This is for a personal project.

Other known choices: I was just working on serializing an object to send it over the network and came across Google's Protocol Buffers. How standardized is object serialization? I've essentially come across three ways to do it (I'm going to talk about Java here, since that's what I did it for): 1) use the language's (Java's) native serialization classes, 2) use your own way of serializing the object, possibly with strings and tokens, 3) use Protocol Buffers or some other known format (JSON, XML, etc.).

From what I've gathered you essentially have 3 main goals to achieve when serializing: 1) Speed/efficiency/size 2) Language independence 3) Version acceptance (in that old versions of the code can still accept parts of the new version & vice versa)

Do most large software projects use Protocol Buffers? Does the answer change if your client is a mobile device with far fewer resources?


Solution

If you use a standard format (JSON, XML, or even Protocol Buffers), there will be far more opportunities for extending your app via integration points. But if it's just internal, then do what's easy. Personally, I create a dedicated persistent proxy class that represents the serialized form of a given object. I then serialize that proxy using whatever method makes the most sense (Java serialization for live over-the-wire transfers, XML for long-term persistence), hooking it in with writeReplace and readResolve. As the class evolves, I can create completely new implementations of the persistent proxy, add versioning to the proxy, and so on, as appropriate. I believe that Bloch discusses this pattern in Effective Java.
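As a rough illustration of the persistent proxy idea described above, here is a minimal sketch using Java's built-in serialization. The names `ProxyDemo`, `Account`, `AccountProxy`, and the `formatVersion` field are hypothetical and only there for the example; the answer does not show its actual classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

final class ProxyDemo {

    // Hypothetical domain class; names and fields are illustrative only.
    static final class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String owner;
        private final long balanceCents;

        Account(String owner, long balanceCents) {
            if (owner == null || balanceCents < 0) {
                throw new IllegalArgumentException("invalid account");
            }
            this.owner = owner;
            this.balanceCents = balanceCents;
        }

        String owner() { return owner; }
        long balanceCents() { return balanceCents; }

        // Called by Java serialization: write the proxy instead of this object.
        private Object writeReplace() {
            return new AccountProxy(owner, balanceCents);
        }

        // Reject any stream that tries to deserialize Account directly.
        private void readObject(ObjectInputStream in) throws InvalidObjectException {
            throw new InvalidObjectException("proxy required");
        }

        // The serialized form. It carries its own version number (an assumption
        // of this sketch) so that later formats can be told apart.
        private static final class AccountProxy implements Serializable {
            private static final long serialVersionUID = 1L;
            private final int formatVersion;
            private final String owner;
            private final long balanceCents;

            AccountProxy(String owner, long balanceCents) {
                this.formatVersion = 1;
                this.owner = owner;
                this.balanceCents = balanceCents;
            }

            // Called after deserialization: rebuild the real object through its
            // public constructor, so its validation runs again.
            private Object readResolve() throws InvalidObjectException {
                if (formatVersion != 1) {
                    throw new InvalidObjectException("unknown format version " + formatVersion);
                }
                return new Account(owner, balanceCents);
            }
        }
    }

    static byte[] toBytes(Serializable value) {
        try (ByteArrayOutputStream buffer = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Account copy = (Account) fromBytes(toBytes(new Account("alice", 1250L)));
        System.out.println(copy.owner() + " " + copy.balanceCents());
    }
}
```

Because only the proxy ever reaches the stream, the fields of `Account` can change freely as long as the proxy (or a newer proxy version) can still rebuild it.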

As for coming up with a purely from-scratch wire protocol, it really depends on how important performance is to the app. As with most things, the more you can leverage standard libraries and protocols, the faster you can get new code out the door. When I see a huge chunk of code devoted to serialization and the like, I generally consider it a code smell and pay very close attention to whether it is justified. Just my $0.02.

And PS - someone posted a question about graphs... This is actually one area where I have intentionally avoided standard serialization. Java's ability to serialize complex graphs is not great: default serialization walks the object graph recursively, so you wind up with stack overflow problems (hah) if the graph is even remotely complex. In these cases, a persistent proxy is very, very important.

Other tips

Question: How common is it for a developer to create their own serialization format?

Assuming that you mean create it from scratch, then the answer is "pretty uncommon".

Also, I'd say it is not "best practice" to do this in general. In most cases, one of the existing commonly used alternatives (Java serialization, JSON, XML, etc.) provides a good solution.

IMO, you should only consider "rolling your own" format (and implementing the corresponding serialization / deserialization code) if you have a clear requirement to do so, or if you have clear evidence that the existing alternatives won't work. Folk wisdom that "XYZ is slow" is not sufficient evidence.

Several things that immediately spring to my mind:

  • Include a version number either in every message or in the connection negotiation. It will save you from huge headaches. Preferably, let the sender know what version the receiver supports.

  • Unless you send data that is naturally binary (images, sound), use a readable plain-text (UTF-8) format. It will help troubleshooting a lot. I'd stick to JSON, but it may not be optimal for you. XML has high overhead.

  • If your messages are long enough, you can try to compress them with some well-known algorithm, like gzip (GZIPOutputStream).
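The three suggestions above can be combined in a small sketch: a plain-text UTF-8 message with a version header on its first line, compressed with gzip. The class name `VersionedMessage`, the constant `PROTOCOL_VERSION`, and the "version on the first line" framing are assumptions made for illustration, not a standard. The payload is treated as opaque text (for example JSON produced elsewhere), and `readAllBytes` assumes Java 9 or newer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class VersionedMessage {
    // Hypothetical protocol version understood by this build.
    static final int PROTOCOL_VERSION = 2;

    // Frame as "<version>\n<payload>", encode as UTF-8, then gzip.
    static byte[] encode(String payload) {
        String framed = PROTOCOL_VERSION + "\n" + payload;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(framed.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The gzip stream is closed here, so the trailer has been written.
        return buffer.toByteArray();
    }

    // Gunzip, check the version header, and return the payload text.
    static String decode(byte[] wire) {
        String framed;
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(wire))) {
            framed = new String(gzip.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        int newline = framed.indexOf('\n');
        if (newline < 0) {
            throw new IllegalArgumentException("missing version header");
        }
        int version = Integer.parseInt(framed.substring(0, newline));
        if (version > PROTOCOL_VERSION) {
            throw new IllegalArgumentException("unsupported version " + version);
        }
        return framed.substring(newline + 1);
    }

    public static void main(String[] args) {
        String json = "{\"name\":\"alice\",\"score\":42}";
        byte[] wire = encode(json);
        System.out.println(wire.length + " bytes on the wire");
        System.out.println(decode(wire));
    }
}
```

For very short messages the gzip header and trailer can make the result larger than the input, which is why the advice above applies only when messages are long enough.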

These steps, as you can see, promote format openness and loose coupling between your server and your possible clients. No one knows what technology your clients will have to use in the future: care to make an iPhone client? An HTML5+JS client? Ditto for the server :)

For structured data, XML or JSON is likely to be the best choice. However, for simple flat records I suggest you consider CSV. You may find it's simpler and faster to implement. If you have a very large number of records, they can be easier to manage as well, e.g. you can load them into a spreadsheet.
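To give a sense of how little code flat CSV records need, here is a minimal sketch of writing and parsing a single record. The class `SimpleCsv` and its methods are hypothetical names; the quoting follows the common convention of wrapping fields that contain commas, quotes, or line breaks in double quotes and doubling embedded quotes. It handles one record per string and does not cover records that span multiple lines, so a mature CSV library is probably the better choice for anything beyond a personal project.

```java
import java.util.ArrayList;
import java.util.List;

final class SimpleCsv {

    // Join fields with commas, quoting only the fields that need it.
    static String writeLine(List<String> fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                line.append(',');
            }
            String field = fields.get(i);
            boolean needsQuotes = field.contains(",") || field.contains("\"")
                    || field.contains("\n") || field.contains("\r");
            if (needsQuotes) {
                // Embedded quotes are escaped by doubling them.
                line.append('"').append(field.replace("\"", "\"\"")).append('"');
            } else {
                line.append(field);
            }
        }
        return line.toString();
    }

    // Split one record back into fields, honoring quoted sections.
    static List<String> parseLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // doubled quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }

    public static void main(String[] args) {
        String line = writeLine(List.of("alice", "Smith, John", "42"));
        System.out.println(line);
        System.out.println(parseLine(line));
    }
}
```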

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow