Notes on Effective Java 10: Serialization
- 74. Implement Serializable judiciously
- 75. Consider using a custom serialized form
- 76. Write readObject methods defensively
- 77. For instance control, prefer enum types to readResolve
- 78. Consider serialization proxies instead of serialized instances
74. Implement Serializable judiciously
Once a class that implements the Serializable interface is published, the flexibility to change its implementation decreases: you have to support its serialized form forever.
Serialization and deserialization are normally performed by ObjectOutputStream.writeObject and ObjectInputStream.readObject.
You should design a high-quality serialized form that can be used for a long time.
Every serializable class has a unique ID, private static final long serialVersionUID. If you don't declare one, the system generates a UID based on the structure of the class.
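As a minimal sketch of the declaration described above, the following declares an explicit UID and round-trips an object through memory. The Point class, the SerialVersionDemo helper, and the value 1L are assumptions made for illustration; they are not from the original notes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical value class, used only for illustration.
class Point implements Serializable {
    // Explicit UID: it stays the same when members are added later.
    private static final long serialVersionUID = 1L;
    final int x;
    final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

class SerialVersionDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the UID the serialization runtime uses for a serializable class.
    static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        Point copy = (Point) roundTrip(new Point(3, 4));
        System.out.println(copy.x + "," + copy.y + " uid=" + uidOf(Point.class));
    }
}
```

If the serialVersionUID line is removed, uidOf still returns a value, but that value is computed from the class structure and can change when the class changes.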
Another cost of implementing Serializable is that it increases the likelihood of bugs and security holes. Usually we create an object using a constructor; serialization provides another, extralinguistic mechanism to create an object. Deserialization is a hidden constructor, so it must enforce all the constraints that the real constructors enforce, otherwise the class is open to attack.
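The "hidden constructor" point can be observed directly: deserializing a Serializable class creates a new object without running that class's constructor. The sketch below counts constructor calls to show this. Temperature and HiddenConstructorDemo are hypothetical names introduced for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class whose constructor enforces an invariant.
class Temperature implements Serializable {
    private static final long serialVersionUID = 1L;
    static int constructorCalls = 0; // static, so never serialized
    final double kelvin;

    Temperature(double kelvin) {
        constructorCalls++;
        if (kelvin < 0) {
            throw new IllegalArgumentException("negative kelvin: " + kelvin);
        }
        this.kelvin = kelvin;
    }
}

class HiddenConstructorDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Temperature original = new Temperature(300.0);
        Temperature copy = (Temperature) roundTrip(original);
        // A second object exists, yet the constructor ran only for the first.
        System.out.println("distinct objects: " + (copy != original));
        System.out.println("constructor calls: " + Temperature.constructorCalls);
    }
}
```

Because the constructor's check is skipped, any invariant it protects has to be re-checked by the deserialization code itself.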
The third cost of serialization is that it increases the testing burden of every new release. When a new version is released, we need to check that an object serialized by the new version can be deserialized by the old version, and vice versa.
If a class is serializable, then all its component classes need to be serializable.
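A small sketch of what happens when a component is not serializable: writing the outer object fails with NotSerializableException. More precisely, the requirement applies to the non-transient fields that are actually written. Engine, Car, and ComponentDemo are hypothetical names for this illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical component that does NOT implement Serializable.
class Engine {
    int horsepower = 100;
}

// Serializable class holding a nonserializable, non-transient component.
class Car implements Serializable {
    private static final long serialVersionUID = 1L;
    final Engine engine = new Engine();
}

class ComponentDemo {
    // Returns true if the object graph can be written, false if some
    // reachable object is not serializable.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Car: " + canSerialize(new Car()));
        System.out.println("String: " + canSerialize("plain text"));
    }
}
```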
A class designed for inheritance should avoid implementing Serializable where possible.
There are exceptions: for example, Throwable implements Serializable so that RMI exceptions can be sent from server to client.
If a serializable class designed for inheritance has invariants that would be violated when its instance fields are left at their default values, add a readObjectNoData method to the class. For more information, see the Java documentation on serialization.
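A minimal sketch of such a method follows. Account and NoDataDemo are hypothetical names for this illustration. The demo only exercises the normal path; readObjectNoData runs only when the stream carries no data for this class, for example when it was written by a subclass version that did not yet extend it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical extendable class whose invariant is "owner is never null".
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String owner;

    Account(String owner) {
        if (owner == null) {
            throw new NullPointerException("owner");
        }
        this.owner = owner;
    }

    String owner() {
        return owner;
    }

    // Invoked by the runtime only when the stream has no data for Account.
    // Default values (owner == null) would break the invariant, so reject.
    private void readObjectNoData() throws InvalidObjectException {
        throw new InvalidObjectException("Stream data required");
    }
}

class NoDataDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Normal round trip: stream data exists, readObjectNoData is not used.
        Account copy = (Account) roundTrip(new Account("alice"));
        System.out.println(copy.owner());
    }
}
```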
Serialization requires all of a class's non-transient fields to be serializable, which makes it hard for a class designed for inheritance to be serializable. In addition, if a nonserializable parent class does not have an accessible parameterless constructor, its subclasses cannot be serializable. Therefore, a nonserializable class designed for inheritance should consider providing a parameterless constructor.
If you want a nonserializable parent class with a serializable subclass, give the parent class an accessible (for example, protected) parameterless constructor.
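The arrangement can be sketched as follows. Shape, Circle, and InheritanceDemo are hypothetical names, and saving the parent's state by hand in writeObject and readObject is one possible approach, assumed here for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Not Serializable, but designed for inheritance.
class Shape {
    private String name;

    // Accessible parameterless constructor: deserializing a subclass runs it.
    protected Shape() {
        this.name = "unnamed";
    }

    protected Shape(String name) {
        this.name = name;
    }

    String name() {
        return name;
    }

    protected void setName(String name) {
        this.name = name;
    }
}

class Circle extends Shape implements Serializable {
    private static final long serialVersionUID = 1L;
    private final double radius;

    Circle(String name, double radius) {
        super(name);
        this.radius = radius;
    }

    double radius() {
        return radius;
    }

    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();   // writes radius
        out.writeObject(name());    // parent state must be written by hand
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();     // restores radius
        setName((String) in.readObject());
    }
}

class InheritanceDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Circle copy = (Circle) roundTrip(new Circle("unit", 1.0));
        System.out.println(copy.name() + " " + copy.radius());
    }
}
```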
Either way, decide cautiously whether a subclass should be serializable.
75. Consider using a custom serialized form
The default serialized form is likely to be appropriate if an object's physical representation is identical to its logical content, for example a class that contains only plain property fields.
Using the default serialized form when an object's physical representation differs substantially from its logical data content has four disadvantages:
- It permanently ties the exported API to the current internal representation. Internal classes and fields become part of the public API. If the internals change in the future, the class still needs to support the old representation.
- It can consume excessive space. Redundant information is also serialized.
- It can consume excessive time. Serialization has to traverse the whole object graph.
- It can cause stack overflows, due to the recursive traversal of the object graph.
StringList is a class that stores a list of Strings. Its logical representation should contain only the elements, and maybe the number of elements. But the physical representation is the list itself, including the links between elements.
The transient modifier omits a field from the default serialized form.
defaultWriteObject writes the non-static and non-transient fields of the current class to the stream. It may only be called from the writeObject method of the class being serialized.
It is recommended to call defaultReadObject even when all the fields are transient. This improves flexibility: if a non-transient field is added to the class in the future, serialization will still work.
The private writeObject method has a documentation comment, because it defines the serialized form.
@serialData tells Javadoc to include this documentation in the serialized form information.
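Putting these pieces together, a custom serialized form for a StringList might look like the sketch below. The singly linked Entry layout, the add and toList methods, and the StringListDemo helper are assumptions made for illustration; only the element count and the elements are written to the stream, never the links.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

final class StringList implements Serializable {
    private static final long serialVersionUID = 1L;
    // All physical-representation fields are transient.
    private transient int size = 0;
    private transient Entry head = null;
    private transient Entry tail = null;

    // Not Serializable: entries are never written directly.
    private static class Entry {
        final String data;
        Entry next;

        Entry(String data) {
            this.data = data;
        }
    }

    void add(String s) {
        Entry e = new Entry(s);
        if (head == null) {
            head = e;
        } else {
            tail.next = e;
        }
        tail = e;
        size++;
    }

    List<String> toList() {
        List<String> result = new ArrayList<>();
        for (Entry e = head; e != null; e = e.next) {
            result.add(e.data);
        }
        return result;
    }

    /**
     * @serialData The size of the list (int), followed by each element
     *             (String) in order.
     */
    private void writeObject(ObjectOutputStream out) throws IOException {
        out.defaultWriteObject();
        out.writeInt(size);
        for (Entry e = head; e != null; e = e.next) {
            out.writeObject(e.data);
        }
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject(); // transient fields start at 0 / null here
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            add((String) in.readObject());
        }
    }
}

class StringListDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        StringList list = new StringList();
        list.add("a");
        list.add("b");
        System.out.println(((StringList) roundTrip(list)).toList());
    }
}
```

Because the elements are written in a loop rather than by following references recursively, a long list does not risk the stack overflow mentioned above.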
Mark fields as transient when needed.
If you are using the default serialized form, all transient fields are initialized to their default values on deserialization: null for object references, 0 for numeric primitives, and false for boolean.
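The default values can be seen in a short sketch. Session and TransientDemo are hypothetical names; note that the field initializers are not re-run during deserialization, so the transient fields do not get their declared initial values back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class mixing persistent and transient fields.
class Session implements Serializable {
    private static final long serialVersionUID = 1L;
    String user = "alice";               // written to the stream
    transient String token = "secret";   // skipped, comes back as null
    transient int attempts = 3;          // skipped, comes back as 0
    transient boolean active = true;     // skipped, comes back as false
}

class TransientDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session copy = (Session) roundTrip(new Session());
        System.out.println(copy.user + " " + copy.token + " "
                + copy.attempts + " " + copy.active);
    }
}
```

If a transient field needs a non-default value after deserialization, a readObject method has to restore it explicitly.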
Declaring an explicit private static final long serialVersionUID avoids incompatibility caused by the default UID differing between versions.
All of our effort here is to resolve the serialization compatibility problem.
76. Write readObject methods defensively
The readObject method is effectively another public constructor. It must validate its input and make defensive copies, just as a constructor does; otherwise an attacker can create an illegal object through it.
An attacker can fabricate a serialized form by hand, by modifying a legitimate one with the help of the serialization documentation.
To solve the problem, you can provide a readObject method for Period that validates the deserialized object.
This still leaves a small but severe problem: an attacker can create a class that modifies a valid Period instance.
This demo shows that the internals of a Period object can be changed through a MutablePeriod object. If an attacker creates a MutablePeriod object and passes mp.period to your program, and the security of your program depends on the immutability of Period, then you will be hacked.
The problem here is that Period does not do enough defensive copying: in.readObject() returns references to its internal fields.
Add defensive copying to readObject.
To defensively copy a final field, we have to remove its final modifier.
The readObject method must implement all the validation that the constructor does.
There are some rules for producing a more robust readObject method:
- For classes with object reference fields that must remain private, defensively copy each object in such a field.
- Check any invariants and throw an InvalidObjectException if a check fails. The checks should follow any defensive copying.
- If an entire object graph must be validated after it is deserialized, use the ObjectInputValidation interface.
- Do not invoke any overridable methods in the class, directly or indirectly.
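The first two rules can be sketched for a Period class as follows. This is an illustrative reconstruction, not the code from the original notes: the field and accessor names and the PeriodDemo helper are assumptions. The Date fields are nonfinal so that readObject can replace them with private copies.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

final class Period implements Serializable {
    private static final long serialVersionUID = 1L;
    // Not final: readObject must be able to assign defensive copies.
    private Date start;
    private Date end;

    Period(Date start, Date end) {
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.compareTo(this.end) > 0) {
            throw new IllegalArgumentException(start + " after " + end);
        }
    }

    Date start() {
        return new Date(start.getTime());
    }

    Date end() {
        return new Date(end.getTime());
    }

    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        // 1. Copy first, so no reference held by the stream's author survives.
        start = new Date(start.getTime());
        end = new Date(end.getTime());
        // 2. Then check the same invariant the constructor checks.
        if (start.compareTo(end) > 0) {
            throw new InvalidObjectException(start + " after " + end);
        }
    }
}

class PeriodDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Period copy = (Period) roundTrip(new Period(new Date(1000L), new Date(2000L)));
        System.out.println(copy.start().getTime() + " to " + copy.end().getTime());
    }
}
```

Copying before validating matters: if the check ran on the original Date objects, an attacker holding references to them could change them after the check passed.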
77. For instance control, prefer enum types to readResolve
If we deserialize a Singleton instance, it will no longer be a Singleton, since deserialization creates another one.
readResolve allows you to substitute another object for the one created by readObject.
This program outputs:
read resolve called changed changed
This means the deserialized object is actually the original one. Compare the result after removing readResolve.
readResolve returns the original INSTANCE. If the object has object reference fields, they should all be declared transient.
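For comparison, here is a separate minimal sketch (not the program whose output is quoted above) that puts a readResolve-based singleton next to the enum alternative named in the item title. Elvis, EnumElvis, and SingletonDemo are hypothetical names for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Singleton that keeps its instance guarantee with readResolve.
final class Elvis implements Serializable {
    private static final long serialVersionUID = 1L;
    static final Elvis INSTANCE = new Elvis();

    private Elvis() {
    }

    // Discards the freshly deserialized object and returns the one instance.
    private Object readResolve() {
        return INSTANCE;
    }
}

// Enum singleton: the serialization mechanism itself preserves identity.
enum EnumElvis {
    INSTANCE
}

class SingletonDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(Elvis.INSTANCE) == Elvis.INSTANCE);
        System.out.println(roundTrip(EnumElvis.INSTANCE) == EnumElvis.INSTANCE);
    }
}
```

The enum version needs no readResolve method and no transient fields, which is the reason the item prefers it for instance control.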
78. Consider serialization proxies instead of serialized instances
The Serialization Proxy Pattern provides a private static nested class for a serializable class. This nested class is the serialization proxy: it has a single constructor, whose parameter is an instance of the enclosing class.
Example: Period class
Then add two methods to the enclosing class, Period: writeReplace and readObject.
When a client serializes a Period object, the program goes into the writeReplace method of Period. It actually serializes a new SerializationProxy object, but the client doesn't know this. The new SerializationProxy object contains all the information of the Period object.
When a client deserializes the Period object, ois actually contains the SerializationProxy object. The program goes into the readResolve method of the SerializationProxy object, which returns a copy of the original Period.
Now a client cannot call the readObject method of Period to fabricate an instance directly.
The Serialization Proxy Pattern ensures that Period is truly immutable: its fields can be final.
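The whole pattern can be sketched as below. This is an illustrative reconstruction under stated assumptions: the field and accessor names, the UID values, and the ProxyDemo helper are hypothetical, while writeReplace, readResolve, and readObject are the standard serialization hooks.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Date;

final class Period implements Serializable {
    private static final long serialVersionUID = 1L;
    private final Date start; // final again: no readObject assigns them
    private final Date end;

    Period(Date start, Date end) {
        this.start = new Date(start.getTime());
        this.end = new Date(end.getTime());
        if (this.start.compareTo(this.end) > 0) {
            throw new IllegalArgumentException(start + " after " + end);
        }
    }

    Date start() {
        return new Date(start.getTime());
    }

    Date end() {
        return new Date(end.getTime());
    }

    // The proxy holds the logical state of the enclosing Period.
    private static class SerializationProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final Date start;
        private final Date end;

        SerializationProxy(Period p) {
            this.start = p.start;
            this.end = p.end;
        }

        // On deserialization, rebuild the Period through its constructor,
        // which copies the dates and checks the invariant.
        private Object readResolve() {
            return new Period(start, end);
        }
    }

    // On serialization, write the proxy instead of this object.
    private Object writeReplace() {
        return new SerializationProxy(this);
    }

    // Reject any stream that claims to contain a Period directly.
    private void readObject(ObjectInputStream in) throws InvalidObjectException {
        throw new InvalidObjectException("Proxy required");
    }
}

class ProxyDemo {
    // Serializes an object into memory and reads it back.
    static Object roundTrip(Object original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Period copy = (Period) roundTrip(new Period(new Date(1000L), new Date(2000L)));
        System.out.println(copy.start().getTime() + " to " + copy.end().getTime());
    }
}
```

Since every deserialized Period is created by the ordinary constructor, the validation and defensive copying live in one place only.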