Explicit serialVersionUID considered harmful?

https://stackoverflow.com/questions/8882046

16-04-2021

Question

I am probably risking some downvotes on this.

It seems to me that explicitly specifying serialVersionUID for new classes is bad. Consider the two failure cases: not changing it when the class layout has changed and it should have been changed, and changing it when it should not have been changed.

Not changing it when it should have been changed occurs almost only when it is explicit. In this case it results in very subtle, hard-to-find bugs, especially during development, when the class layout changes often. But if it has not been explicitly specified, it will change and deserialization will break loudly, most likely solved by purging the repository.

Changing it when it should not have been changed occurs almost only when it is implicit. This is the rare case where the class layout has changed but we still want to deserialize the old serialized blobs. This will likely be caught during QA ("Strange errors after upgrade from 5.2 to 5.2.1, see attached stack trace") and can be trivially fixed by setting an explicit value.

Comments?

Solution

The implicit value may change when it shouldn't for reasons other than class layout changes: the computed default depends on the compiler implementation. If you debug with Eclipse but do production builds with javac, you may end up with two incompatible sets of data.
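To check whether two builds of the same class really disagree, you can print the UID the serialization runtime will actually use. The following is only a minimal sketch: `SerialUidInspector`, `Settings` and `effectiveUid` are hypothetical names made up for illustration, while `ObjectStreamClass.lookup` and `getSerialVersionUID` are the standard JDK calls.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialUidInspector {

    // Hypothetical data class with no explicit serialVersionUID,
    // so the runtime computes a default from the compiled class.
    static class Settings implements Serializable {
        String name;
        int retries;
    }

    // Returns the UID the serialization runtime uses for the class:
    // the declared value if there is one, otherwise the computed default.
    static long effectiveUid(Class<?> type) {
        ObjectStreamClass descriptor = ObjectStreamClass.lookup(type);
        if (descriptor == null) {
            // lookup returns null for classes that are not serializable
            throw new IllegalArgumentException(type.getName() + " is not serializable");
        }
        return descriptor.getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(Settings.class.getName() + " -> " + effectiveUid(Settings.class));
    }
}
```

Running this against the class files produced by each compiler and comparing the printed numbers would show whether the two builds are compatible. The `serialver` tool shipped with the JDK reports the same value from the command line.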

Other tips

At my work we explicitly prohibit specifying serialVersionUID, exactly because of the problems you bring up.

In addition, the classes we persist are only used to store data with no logic inside, so the only way they change is because of changing data members.

To further emphasize what Jon Skeet said and to contradict the comment:

"If you don't need that (i.e. you always serialize and de-serialize with the same class version), then you can safely skip the explicit declaration"

Even if you are not serializing long-term and are using the same class version, you could still have issues. If you are writing client-server code and the client could run with a different JVM version or implementation than the server, you can have the same problems with incompatible serialVersionUID values.

To summarize, the only time it is "safe" to not specify a serialVersionUID is when you are not serializing long-term and you are guaranteed that all consumers of the serialized data will use the same JVM implementation and version as the original producer.

In short, not using serialVersionUID is generally the more harmful situation.
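For reference, an explicit declaration looks roughly like this. It is a sketch only: `ExplicitUidExample`, `Message` and `declaredUid` are hypothetical names, and the value `1L` is an arbitrary choice. The runtime picks up the field only if it is a `static final long` named `serialVersionUID`.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class ExplicitUidExample {

    // Hypothetical message type shared by a client and a server.
    static class Message implements Serializable {
        // Declared explicitly, so the value no longer depends on
        // which compiler or JVM produced and loaded the class.
        private static final long serialVersionUID = 1L;

        final String body;

        Message(String body) {
            this.body = body;
        }
    }

    // Asks the serialization runtime which UID it will use for Message.
    static long declaredUid() {
        return ObjectStreamClass.lookup(Message.class).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(declaredUid()); // prints 1
    }
}
```

With the field in place, the client and the server agree on the UID regardless of how each side was built, which is the guarantee this answer is after.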

When you need to support long-term persistence via serialization, you almost always need custom code to support it and need to set the serialVersionUID explicitly, as otherwise older serialized versions will not be deserializable by newer code.

Those scenarios already require a great deal of care to get all the cases right when the class changes, so the serialVersionUID is the least of your problems.

If you don't need that (i.e. you always serialize and de-serialize with the same class version), then you can safely skip the explicit declaration, as the computed value will make sure that the correct version is used.

Whether you go for serialVersionUID or not (I suggest you do), you should really consider creating a comprehensive set of tests for serial compatibility.

It's also worth designing the serial format with care. It is effectively a public API.
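A compatibility test of the kind suggested above could start from something like the sketch below. `SerialRoundTrip`, `Account`, `toBytes` and `fromBytes` are hypothetical names, and the example only covers an in-memory round trip within one class version. A real suite would presumably also keep byte streams written by released versions as fixtures and check that current code can still read them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerialRoundTrip {

    // Hypothetical persisted class used only for this illustration.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;

        final String owner;
        final long balanceCents;

        Account(String owner, long balanceCents) {
            this.owner = owner;
            this.balanceCents = balanceCents;
        }
    }

    // Serializes a value into a byte array using standard Java serialization.
    static byte[] toBytes(Serializable value) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        }
        return buffer.toByteArray();
    }

    // Reads a single object back from a byte array.
    static Object fromBytes(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Account original = new Account("alice", 1250L);
        Account copy = (Account) fromBytes(toBytes(original));

        // Fail loudly if the round trip changed any field.
        if (!copy.owner.equals(original.owner) || copy.balanceCents != original.balanceCents) {
            throw new AssertionError("round trip changed the object");
        }
        System.out.println("round trip ok");
    }
}
```

Saving the output of `toBytes` to a file at release time and feeding it to `fromBytes` in later builds would turn this into a check on the serial format itself, treating it like the public API it effectively is.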

If you're just using serialization for a remote method call, e.g. calling an EJB, where the client and server classes and JVM are the same (which I suspect is by far the most common use), then setting serialVersionUID explicitly (as Eclipse, for example, suggests) is likely to cause you significant pain in the form of occasional, inexplicable bugs where incompatible class instances are treated as compatible because of the fixed serialVersionUID. Remote calls will silently go wrong during low-level serialization, and the problem will only crop up later, when your object's state is inconsistent. You find the source of the problem only when you realize your client and server classes are somehow different (though the serialVersionUID, of course, is not). In my experience, setting serialVersionUID for this reason does more harm than good.

If, on the other hand, you explicitly set serialVersionUID to read in old data, you are by definition reading in an incompatible version and are likely to end up with an object in an inconsistent or incomplete state. In this case setting serialVersionUID is a workaround for a different problem.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow