Friday, May 15, 2015

The Effect of SerialVersionUID (SUID)

What is serialVersionUID?

Each serializable class is associated with a SUID. During deserialization, the Java Serialization runtime uses the fully qualified class name and the SUID to verify that the loaded class is compatible with the serialized form. If the loaded class's name and SUID do not match those recorded in the serialized form, the runtime throws java.io.InvalidClassException. We can declare the SUID explicitly in our serializable class. Below is the SUID signature.

<ANY-ACCESS-MODIFIER> static final long serialVersionUID = <VALUE>;

If we do not declare it explicitly, the Java Serialization runtime calculates a default SUID for the class at run time, based on various aspects of the class, as described in the Java(TM) Object Serialization Specification.
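To see both cases side by side, here is a minimal sketch that asks the serialization runtime for the SUID it associates with a class, using java.io.ObjectStreamClass. The class names SuidLookupDemo, Declared and Defaulted and the helper suidOf are made up for this illustration; they are not part of any API.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SuidLookupDemo {

    // Hypothetical class that declares its SUID explicitly.
    static class Declared implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
    }

    // Hypothetical class with no declaration: the runtime computes a default SUID.
    static class Defaulted implements Serializable {
        String name;
    }

    // Returns the SUID the serialization runtime associates with a serializable class.
    static long suidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        // Prints the declared value for Declared and a computed hash for Defaulted.
        System.out.println("Declared:  " + suidOf(Declared.class));
        System.out.println("Defaulted: " + suidOf(Defaulted.class));
    }
}
```

The value reported for Defaulted is computed from the class definition, so it stays the same between runs as long as the compiled class does not change.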

Why should I explicitly declare serialVersionUID?

The default SUID calculation is based on the class definition and is highly sensitive to any change in it. Changes such as adding or removing a non-private method or an instance field produce a brand new SUID value.

Let's say we do not explicitly declare a SUID for our class, Person. We serialize a Person object and persist it to a file called Person.ser. After we add a new method and recompile to get an updated Person.class, this class can no longer deserialize Person.ser: the class name still matches, but the SUID is different. What we get is java.io.InvalidClassException.

Even if we make no changes to the class, the same risk remains when we compile Person with a different Java compiler. The default SUID calculation is sensitive to class details that may vary between compiler implementations, so the same source can end up with different SUIDs. In short, it is strongly recommended to declare the SUID explicitly to avoid the issues mentioned above.
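Reproducing the failure normally needs two compiled versions of the same class. As a rough single-file approximation, the sketch below serializes a Person in memory and then overwrites the SUID bytes recorded in the stream, so that the stream and the loaded class disagree. The class SuidMismatchDemo and its helper methods are hypothetical, and the byte offset relies on the stream layout for a stream whose first item is a plain object (magic, version, TC_OBJECT, TC_CLASSDESC, class name, SUID).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class SuidMismatchDemo {

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name = "Alice";
    }

    // Serializes an object into an in-memory byte array.
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Returns true if the stream deserializes, false on InvalidClassException.
    static boolean deserializes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return true;
        } catch (InvalidClassException e) {
            System.out.println("Rejected: " + e.getMessage());
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns a copy of the stream in which the recorded SUID is 2 instead of 1.
    // Assumed layout: magic (2) + version (2) + TC_OBJECT (1) + TC_CLASSDESC (1)
    // + name length (2) + ASCII class name + SUID (8 bytes, big-endian).
    static byte[] withDifferentSuid(byte[] data) {
        byte[] copy = data.clone();
        int suidStart = 2 + 2 + 1 + 1 + 2 + Person.class.getName().length();
        copy[suidStart + 7] = 2; // lowest byte of the SUID
        return copy;
    }

    public static void main(String[] args) {
        byte[] original = serialize(new Person());
        System.out.println("original accepted: " + deserializes(original));
        System.out.println("patched accepted:  " + deserializes(withDifferentSuid(original)));
    }
}
```

Patching bytes is only a stand-in for recompiling the class with a different SUID; the check that rejects the stream is the same one described above.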

What value should I put for serialVersionUID?

Some people prefer a value as simple as 1L; some use a certain format, for example a date-time format such as 20140620010100L; some prefer to use the serialver tool provided with Java to generate the SUID value. Well, to be precise, serialver does not generate a SUID value for a class. It inspects the SUID of a class. Let's prove that.

We have a Person class as below:

import java.io.Serializable;

class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    String name;
}

When we run the serialver command on this class, it prints the SUID we declared, 1L.


If we change the SUID value to 2L and run the serialver command again, it prints 2L as our class's SUID.


Let's say we remove the SUID in order to use the default SUID.

class Person implements Serializable {
    String name;
}

No matter how many times we run the serialver command, we still get the same default SUID value.


Now, we add a new method into Person.

class Person implements Serializable {

    String name;

    String getName() {
        return name;
    }
}

Run the serialver command and we get a brand new default SUID.


In conclusion, serialver inspects a class's SUID, whether it is explicitly declared or calculated by default. However, once we copy the SUID printed by serialver and paste it into our class, it permanently becomes our class's SUID.

If you ask me for my preference, I prefer a value in date-time format because it is simple, manageable and informative. Moreover, I have not seen any difference in behavior from using a complex SUID.

When should we change the serialVersionUID?

The name serialVersionUID implies version control over class evolution. This is obvious when we rely on the Java Serialization runtime to calculate the SUID for our class: a new SUID is calculated every time the class definition changes, even when the change does not break compatibility with the serialized form. However, this simple intention can turn into a nightmare if we need to support old serialized forms that were persisted with the original SUID. The class with the new SUID can no longer deserialize them. To avoid this issue, it is strongly encouraged to declare the SUID explicitly.

Once we have declared the SUID explicitly, it never changes, whether our changes to the class are compatible or incompatible. In this case, the runtime always accepts the persisted serialized forms as the same version as the loaded class. To support backward compatibility, we have to do our best to handle any incompatible changes we introduce, using special handling methods such as readObject(java.io.ObjectInputStream), which the Java Serialization runtime recognizes. Well, this brings us back to the same question: when should we change the explicitly declared SUID?
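Before answering, here is a minimal sketch of the readObject hook mentioned above. It assumes a hypothetical email field was added to Person in a later release, so older streams do not carry it; a field missing from the stream is left at its default value (null), and readObject fills in a placeholder. The class ReadObjectDemo, the email field and the "unknown" placeholder are all made up for this illustration, and the old stream is imitated by serializing a Person whose email is null.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class ReadObjectDemo {

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L; // unchanged across releases
        String name;
        String email; // hypothetical field added in a later release

        Person(String name, String email) {
            this.name = name;
            this.email = email;
        }

        // Called by the serialization runtime while deserializing a Person.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject(); // restores the fields present in the stream
            if (email == null) {
                email = "unknown"; // placeholder for streams without the field
            }
        }
    }

    // Serializes and immediately deserializes a Person in memory.
    static Person roundTrip(Person person) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(person);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A null email stands in for an old stream that never had the field.
        System.out.println(roundTrip(new Person("Alice", null)).email);
        System.out.println(roundTrip(new Person("Bob", "bob@example.com")).email);
    }
}
```

This kind of handling works for additive changes like a new field; it gets harder as the changes become more drastic.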

When the changes made to the class are so big that handling backward compatibility takes much more effort than rewriting the class, we have no choice but to give up all persisted serialized forms in order to make things right and save maintenance cost. This situation mostly arises because the serializable class was not properly designed in the first place. The make-it-work-and-refactor-later approach does not work when it comes to serialization. Never simply implement Serializable for a class without proper design; once the class is released, it can cost you in long-term maintenance. Read Designing serialized form to learn more about what we can do when designing a serializable class.

Think twice and weigh the impact against the benefits before you decide to change the SUID. After the SUID is changed, your persisted serialized forms officially become an old version. You can only deserialize them using the old version of the class.

Because of all these facts, I don't think serialVersionUID is meant to be changed. I would rather regard serialVersionUID as a Stream Unique ID than as a version control ID for the class.


References:
Serializable API Doc
Java(TM) Object Serialization Specification
