Processing differences in the serialization runtime
Basically, both Serializable and Externalizable have to construct metadata and write it into the stream. This metadata can then be read back in order to reconstitute the Java object during deserialization. The metadata includes a class description for each class in the class hierarchy of the Java object being serialized.
For Serializable:
Below is a diagram that illustrates the serialization process against a Serializable Java object.
When a Serializable object is being serialized, the serialization runtime traverses its class hierarchy from bottom to top and constructs a class description for each class in that hierarchy. This class description contains the following information:
- Class Identity
  - Stream Unique ID/serialVersionUID
  - Fully Qualified Class Name
- Serializable Fields Information
  - Number of fields
  - Name and type of fields
- Others
  - Operational flags
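To see what such a class description holds, we can ask the runtime for it. The sketch below calls `ObjectStreamClass.lookup` from the standard library on a stand-in Person class. The wrapper class `DescriptorDemo` and the `describe` helper are names made up for this illustration.

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.Date;

// Hypothetical demo class, not part of the original post.
class DescriptorDemo {

    // Stand-in for the Serializable Person class discussed in this post.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        int id;
        String name;
        Date dob;
    }

    // Returns the class description that the runtime builds for Person.
    static ObjectStreamClass describe() {
        return ObjectStreamClass.lookup(Person.class);
    }

    public static void main(String[] args) {
        ObjectStreamClass desc = describe();
        System.out.println("Class name:       " + desc.getName());
        System.out.println("serialVersionUID: " + desc.getSerialVersionUID());
        System.out.println("Number of fields: " + desc.getFields().length);
        for (ObjectStreamField field : desc.getFields()) {
            System.out.println("  " + field.getName() + " : " + field.getType().getName());
        }
    }
}
```

The descriptor reports the class identity and one entry per serializable field, which matches the list above.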
For Externalizable:
Below is a diagram that illustrates the serialization process against an Externalizable Java object.
When an Externalizable object is being serialized, the serialization runtime also traverses its class hierarchy from bottom to top in order to construct the class descriptions. However, the class description for an Externalizable object does not contain field information.
- Class Identity
  - Stream Unique ID/serialVersionUID
  - Fully Qualified Class Name
- Others
  - Operational flags
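The missing field information can be observed with `ObjectStreamClass.lookup` as well. This is a minimal sketch on a stand-in Externalizable class. The names `ExternalizableDescriptorDemo` and `describe` are assumptions made for this illustration.

```java
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectStreamClass;

// Hypothetical demo class, not part of the original post.
class ExternalizableDescriptorDemo {

    // Stand-in Externalizable class with two fields.
    public static class Person implements Externalizable {
        private static final long serialVersionUID = 1L;
        int id;
        String name;

        // Externalizable classes need a public no-arg constructor.
        public Person() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            id = in.readInt();
            name = in.readUTF();
        }
    }

    // Returns the class description that the runtime builds for Person.
    static ObjectStreamClass describe() {
        return ObjectStreamClass.lookup(Person.class);
    }

    public static void main(String[] args) {
        ObjectStreamClass desc = describe();
        System.out.println("Class name:       " + desc.getName());
        System.out.println("serialVersionUID: " + desc.getSerialVersionUID());
        // The class declares two fields, but the descriptor lists none of them.
        System.out.println("Number of fields: " + desc.getFields().length);
    }
}
```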
Programmatic difference
Serializable is a marker interface
Serializable needs field information in order to reconstitute the fields of the target object. This is done in the background and does not require any effort from the programmer. Even a novice Java programmer can create a Serializable class by merely implementing the `Serializable` interface, and serialization just works like magic! Below is an example of a Serializable Person class.
    import java.io.Serializable;
    import java.util.Date;

    public class Person implements Serializable {
        int id;
        String name;
        Date dob;
    }
Externalizable is a contractual interface
On the other hand, Externalizable never captures field information. The question is, how does Externalizable reconstitute the fields of the target object? Externalizable does not do the magic that Serializable does. Instead, it delegates to the programmer the task of deciding which fields to serialize and deserialize. In other words, field data is written directly to and read directly from the stream by the Externalizable methods that the class implements. Below is an example of an Externalizable Person class.
    import java.io.Externalizable;
    import java.io.IOException;
    import java.io.ObjectInput;
    import java.io.ObjectOutput;
    import java.util.Date;

    public class Person implements Externalizable {
        int id;
        String name;
        Date dob;

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
            out.writeObject(dob);
        }

        // Fields must be read in the same order they were written.
        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            id = in.readInt();
            name = in.readUTF();
            dob = (Date) in.readObject();
        }
    }
As you can see, more lines of code are needed to achieve the same thing that Serializable does. By the way, Serializable and Externalizable classes can coexist at different levels of the same class hierarchy. However, bear in mind that if any class in the hierarchy of the target object implements Externalizable, it supersedes Serializable. The serialization runtime will then call the Externalizable methods instead of automatically serializing and deserializing the target object's fields.
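The effect of mixing both interfaces in one hierarchy can be sketched as follows. The classes `Base` and `Child`, the wrapper `MixedHierarchyDemo` and the `roundTrip` helper are all hypothetical names for this illustration. The subclass deliberately writes only its own field, so we can see that the Serializable superclass field is no longer handled automatically.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of the original post.
class MixedHierarchyDemo {

    // Serializable superclass: on its own, baseValue would be serialized automatically.
    public static class Base implements Serializable {
        int baseValue;

        public Base() { }
    }

    // Externalizable subclass: the runtime only calls writeExternal/readExternal.
    public static class Child extends Base implements Externalizable {
        int childValue;

        public Child() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(childValue); // baseValue is deliberately not written
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            childValue = in.readInt();
        }
    }

    // Serializes the object into memory and reads it back.
    static Child roundTrip(Child original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Child) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Child child = new Child();
        child.baseValue = 42;
        child.childValue = 7;

        Child copy = roundTrip(child);
        System.out.println("baseValue:  " + copy.baseValue);  // 0, the superclass field was lost
        System.out.println("childValue: " + copy.childValue); // 7
    }
}
```

If the superclass state matters, `writeExternal` and `readExternal` have to write and read it explicitly.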
Outcome difference
Size difference
Obviously, the serialized form generated by Serializable is bigger than the one generated by Externalizable because it includes field information. The program below serializes a Person object, which can be an instance of either the Serializable or the Externalizable Person class given in the previous section.
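    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.ObjectOutputStream;
    import java.nio.channels.FileChannel;
    import java.nio.file.Paths;
    import java.nio.file.StandardOpenOption;
    import java.util.Date;

    // The class name is illustrative; Person is either class from the previous section.
    public class SerializedSizeTest {

        private static final String FILENAME = "D:/person.ser";

        public static void main(String[] args) throws Exception {
            Person person = new Person();
            person.id = 123;
            person.name = "HauChee";
            person.dob = new Date(1429945971467L);
            serialize(person);

            // Print the size of the serialized form in bytes.
            try (FileChannel channel = FileChannel.open(
                    Paths.get(FILENAME), StandardOpenOption.READ)) {
                System.out.println(channel.size());
            }
        }

        public static void serialize(Object obj) throws IOException {
            try (ObjectOutputStream out =
                    new ObjectOutputStream(new FileOutputStream(FILENAME))) {
                out.writeObject(obj);
            }
        }
    }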
The result: the Serializable serialized form is 174 bytes, which is bigger than the Externalizable serialized form at 118 bytes.
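The same comparison can be done in memory, without touching the file system. The sketch below is an assumption-laden variant of the program above: `SizeComparisonDemo`, `SerPerson`, `ExtPerson` and `serializedSize` are made-up names, and both variants live side by side as nested classes. The class name is part of the stream, so the byte counts will differ from the 174 and 118 reported above. Only the relative difference is expected to hold.

```java
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

// Hypothetical demo class, not part of the original post.
class SizeComparisonDemo {

    public static class SerPerson implements Serializable {
        int id;
        String name;
        Date dob;
    }

    public static class ExtPerson implements Externalizable {
        int id;
        String name;
        Date dob;

        public ExtPerson() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
            out.writeObject(dob);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            id = in.readInt();
            name = in.readUTF();
            dob = (Date) in.readObject();
        }
    }

    // Serializes obj into a byte array and returns the number of bytes written.
    static int serializedSize(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.size();
    }

    public static void main(String[] args) {
        SerPerson ser = new SerPerson();
        ser.id = 123;
        ser.name = "HauChee";
        ser.dob = new Date(1429945971467L);

        ExtPerson ext = new ExtPerson();
        ext.id = 123;
        ext.name = "HauChee";
        ext.dob = new Date(1429945971467L);

        System.out.println("Serializable:   " + serializedSize(ser) + " bytes");
        System.out.println("Externalizable: " + serializedSize(ext) + " bytes");
    }
}
```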
Processing speed difference
Serializable takes longer because the runtime has to inspect the object graph to construct the field information during serialization. It also takes time to digest the field information in order to reconstruct the objects during deserialization. All of this is done using Java Reflection, which is commonly regarded as slow compared with direct method calls.
Externalizable skips this kind of processing because it does not have the same obligation as Serializable. It is the programmer's responsibility to provide the implementation that writes field data to and reads it from the stream. The serialization runtime basically just makes normal method calls to the Externalizable API. As a result, Externalizable generally performs better in terms of processing speed.
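For a rough feel of the speed difference, a naive timing loop like the one below can be used. It is only a sketch: `TimingSketch`, `SerPerson`, `ExtPerson` and `roundTripNanos` are hypothetical names, and results vary a lot with the JVM, warm-up and object shape. A proper benchmark harness would be needed for numbers you can trust.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical demo class, not part of the original post.
class TimingSketch {

    public static class SerPerson implements Serializable {
        int id = 123;
        String name = "HauChee";
    }

    public static class ExtPerson implements Externalizable {
        int id = 123;
        String name = "HauChee";

        public ExtPerson() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            id = in.readInt();
            name = in.readUTF();
        }
    }

    // Serializes and deserializes obj the given number of times; returns elapsed nanoseconds.
    static long roundTripNanos(Object obj, int iterations) {
        try {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                    out.writeObject(obj);
                }
                try (ObjectInputStream in = new ObjectInputStream(
                        new ByteArrayInputStream(bytes.toByteArray()))) {
                    in.readObject();
                }
            }
            return System.nanoTime() - start;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int iterations = 100_000;
        // Warm-up runs so that the JIT compiler has a chance to kick in.
        roundTripNanos(new SerPerson(), iterations);
        roundTripNanos(new ExtPerson(), iterations);

        System.out.println("Serializable:   " + roundTripNanos(new SerPerson(), iterations) + " ns");
        System.out.println("Externalizable: " + roundTripNanos(new ExtPerson(), iterations) + " ns");
    }
}
```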
Tips:
Well, there is a way for Serializable to generate a smaller serialized form and catch up on speed. This can be done by customizing the Serializable serialized form. Read designing serialized form for more detail.
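As a taste of what such customization can look like, here is a minimal sketch. It assumes one possible design, not necessarily the one from the linked post: every field is marked `transient` and is written by hand in the private `writeObject` method, with the date stored as a plain `long`. `CompactFormDemo`, `DefaultPerson`, `CompactPerson`, `serializedSize` and `roundTrip` are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

// Hypothetical demo class, not part of the original post.
class CompactFormDemo {

    // Default serialized form: the runtime records and writes every field.
    static class DefaultPerson implements Serializable {
        int id;
        String name;
        Date dob;
    }

    // Custom serialized form: no field metadata, data written by hand.
    // Assumes name and dob are never null.
    static class CompactPerson implements Serializable {
        private static final long serialVersionUID = 1L;
        transient int id;
        transient String name;
        transient Date dob;

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();      // nothing to write, kept for future compatibility
            out.writeInt(id);
            out.writeUTF(name);
            out.writeLong(dob.getTime());  // a long instead of a full Date object
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            id = in.readInt();
            name = in.readUTF();
            dob = new Date(in.readLong());
        }
    }

    static byte[] toBytes(Object obj) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    static int serializedSize(Object obj) {
        return toBytes(obj).length;
    }

    static Object roundTrip(Object obj) {
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(toBytes(obj)))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DefaultPerson def = new DefaultPerson();
        def.id = 123;
        def.name = "HauChee";
        def.dob = new Date(1429945971467L);

        CompactPerson compact = new CompactPerson();
        compact.id = 123;
        compact.name = "HauChee";
        compact.dob = new Date(1429945971467L);

        System.out.println("Default form: " + serializedSize(def) + " bytes");
        System.out.println("Custom form:  " + serializedSize(compact) + " bytes");
    }
}
```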
Difference in terms of side effects on the programmer
Serializable gives out the fish directly
Serializable is powerful enough to do all the hard work in the background. As a result, programmers tend to take it for granted and do not bother to learn more about it, because it works by default. They may not know that there are ways to optimize Serializable and use it properly. Moreover, the problems caused by improper use of Serializable do not surface right away. They usually appear a few releases later, when your classes have grown bigger (performance issues) or an old algorithm has to be changed (backward compatibility issues). Read designing serialized form to learn how to minimize the chances of running into such problems when designing your Serializable class.
Externalizable requires the programmer to do the fishing
Things do not work so simply with Externalizable. The programmer has to determine which fields need to be serialized and which do not. The good part is that the programmer is forced to know the fields of the target object's superclasses, because they are not serialized by default. With Serializable, this part is ignored most of the time, which may cause unnecessary or duplicate data to be written into the stream. In short, to a greater or lesser degree, Externalizable gives the programmer room to think about and design the desired serialized form.
Serializable or Externalizable?
So far, this post may seem to favor Externalizable. It avoids unnecessary metadata processing and forces the programmer to learn about and deal with data serialization. However, Externalizable has a drawback that should make you think twice before using it. A class that implements Externalizable has two additional public methods that are callable from the outside world. These methods are only needed by the serialization runtime and are not meant for other callers. An unintended call to them may change the state of the object. In the worst case, they could become a weak point open to attack.
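The exposure is easy to demonstrate. In the sketch below, ordinary code that merely holds a reference to an Externalizable object feeds it hand-crafted bytes through the public `readExternal` method and overwrites its state. `ExposedMethodsDemo` and `overwrite` are hypothetical names for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical demo class, not part of the original post.
class ExposedMethodsDemo {

    public static class Person implements Externalizable {
        int id;
        String name;

        public Person() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(id);
            out.writeUTF(name);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            id = in.readInt();
            name = in.readUTF();
        }
    }

    // Any caller can do this: no serialization runtime is involved in the call.
    static void overwrite(Person victim, int newId, String newName) {
        try {
            // Craft a stream that contains exactly what readExternal expects.
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeInt(newId);
                out.writeUTF(newName);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                victim.readExternal(in); // public method, callable from anywhere
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person person = new Person();
        person.id = 1;
        person.name = "Alice";

        overwrite(person, 999, "Mallory");
        System.out.println(person.id + " " + person.name); // 999 Mallory
    }
}
```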
I personally prefer and recommend Serializable. It can do what Externalizable does. It provides a few special private methods (such as `writeObject` and `readObject`) for customizing the serialized form to make it lighter and more efficient. Programmers who use Serializable can choose default serialization, custom serialization, or even a mix of both.
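A mix of default and custom serialization could look like the sketch below. The choice of which field gets custom treatment is an assumption made for illustration: `id` and `name` are left to the default mechanism, while `dob` is marked `transient` and written by hand as a `long`. `MixedFormDemo` and `roundTrip` are made-up names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

// Hypothetical demo class, not part of the original post.
class MixedFormDemo {

    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        int id;             // handled by default serialization
        String name;        // handled by default serialization
        transient Date dob; // handled by the custom code below; assumed non-null

        private void writeObject(ObjectOutputStream out) throws IOException {
            out.defaultWriteObject();     // default part: id and name
            out.writeLong(dob.getTime()); // custom part: dob as a long
        }

        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            dob = new Date(in.readLong());
        }
    }

    // Serializes the object into memory and reads it back.
    static Person roundTrip(Person original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person person = new Person();
        person.id = 123;
        person.name = "HauChee";
        person.dob = new Date(1429945971467L);

        Person copy = roundTrip(person);
        System.out.println(copy.id + " " + copy.name + " " + copy.dob.getTime());
    }
}
```

Unlike `writeExternal` and `readExternal`, these two methods are private, so only the serialization runtime calls them.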
One last point I would like to emphasize: using Externalizable does not mean that our serialized form is properly designed. If a programmer just writes whatever field data into the stream, most of the time this does not help minimize the impact of backward compatibility issues. In short, whether we use Serializable or Externalizable, we still need to design our serialized form properly for the sake of future maintenance.