More on Serialization

by Prasanth Gullapalli

Serialization is about saving the state of an object and reconstructing that object later when it is needed. Because it captures the state of an instance, static variables are not serialized as part of the process. When we do not want certain fields of an object to be serialized, we mark them as transient. For example, we would not want to serialize a database connection.
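To make the static and transient rules concrete, here is a minimal sketch. The Session class, its fields, and the helper methods are hypothetical names chosen for illustration, and the String handle merely stands in for a real database connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientDemo {

    // Hypothetical class for illustration only.
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;

        static int openSessions = 0;        // static: class-level state, not written to the stream
        String user;                        // ordinary instance field: serialized by default
        transient String connectionHandle;  // transient: skipped during serialization

        Session(String user, String connectionHandle) {
            this.user = user;
            this.connectionHandle = connectionHandle;
        }
    }

    static byte[] serialize(Object obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            return bos.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    static Object deserialize(byte[] bytes) {
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session.openSessions = 1;
        byte[] bytes = serialize(new Session("alice", "conn-42"));

        Session.openSessions = 7;  // change the static after the bytes were written
        Session copy = (Session) deserialize(bytes);

        System.out.println(copy.user);              // alice
        System.out.println(copy.connectionHandle);  // null
        System.out.println(Session.openSessions);   // 7, the stream did not restore the old value
    }
}
```

The transient field comes back as null, and the static keeps whatever value the class currently holds, because neither was ever part of the serialized form.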

A serialize/deserialize round trip effectively produces a deep copy of an object. For a class to participate in serialization, it has to implement one of two interfaces: Serializable or Externalizable. To understand when to choose which, we need to understand the differences between them.

A fundamental difference is that Serializable is a marker interface, whereas Externalizable defines two methods. Here is the API of Externalizable:

public interface Externalizable extends java.io.Serializable {
    void writeExternal(ObjectOutput out) throws IOException;
    void readExternal(ObjectInput in) throws IOException, ClassNotFoundException;
}

Notice that Externalizable extends Serializable. When a class is marked as Serializable, Java itself takes care of the serialization process. Externalizable, in contrast, lets you define custom rules and your own mechanism for serialization. When a class implements Serializable and we define a subclass of it, we need not write any specific code in the subclass for it to be serialized properly. With Externalizable, we need to override the methods from the interface and provide the specific behavior for the subclass. And this applies not only when a new subclass is defined: whenever we change the state of the original class, we have to check whether the change affects the serialization behavior and update the two methods accordingly.
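The maintenance cost described above can be sketched as follows. Point, LabeledPoint, and FixedLabeledPoint are hypothetical classes made up for this illustration: the first subclass adds a field but forgets to override the two methods, the second one overrides them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

class ExternalizableSubclassDemo {

    public static class Point implements Externalizable {
        int x;

        public Point() { }  // Externalizable classes need a public no-arg constructor

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeInt(x);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            x = in.readInt();
        }
    }

    // Adds state but does not override the two methods: label is silently dropped.
    public static class LabeledPoint extends Point {
        String label;

        public LabeledPoint() { }
    }

    // Overrides both methods, so the new field is written and read back.
    public static class FixedLabeledPoint extends Point {
        String label;

        public FixedLabeledPoint() { }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            super.writeExternal(out);
            out.writeObject(label);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            super.readExternal(in);
            label = (String) in.readObject();
        }
    }

    @SuppressWarnings("unchecked")
    static <T> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return (T) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        LabeledPoint lost = new LabeledPoint();
        lost.x = 1;
        lost.label = "origin";
        LabeledPoint lostCopy = roundTrip(lost);
        System.out.println(lostCopy.x + " " + lostCopy.label);  // 1 null

        FixedLabeledPoint kept = new FixedLabeledPoint();
        kept.x = 2;
        kept.label = "corner";
        FixedLabeledPoint keptCopy = roundTrip(kept);
        System.out.println(keptCopy.x + " " + keptCopy.label);  // 2 corner
    }
}
```

Had Point implemented plain Serializable instead, both subclasses would have carried their label through the round trip without any extra code.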

It is beneficial to choose Externalizable when we have a superclass and a subclass, and we want only the subclass's properties to be serialized. With Serializable, when the superclass is itself Serializable, it is not possible to serialize the subclass properties alone, because by default the JVM serializes the superclass properties too. As we do not control the serialization mechanism, we cannot alter this behavior. Now suppose we have a class that is not Serializable, and it has a subclass that is Serializable.

public class SuperClass {
	private String propertyA;
	private String propertyB;

	// Getters and Setters
}

public class Subclass extends SuperClass implements Serializable {
	private String propertyC;

	//Getters and Setters
}

public class SerializationUtil {
	public static Object deserialize(byte[] byteArray)
	throws IOException, ClassNotFoundException {
		ByteArrayInputStream bis = new ByteArrayInputStream(byteArray);
		ObjectInputStream ois = new ObjectInputStream(bis);
		return ois.readObject();
	}

	public static ByteArrayOutputStream serialize(Object object)
	throws IOException {
		ByteArrayOutputStream bos = new ByteArrayOutputStream();
		ObjectOutputStream oos = new ObjectOutputStream(bos);
		oos.writeObject(object);
		return bos;
	}
}

public class SerializationTest {
	public static void main(String[] args) throws FileNotFoundException, IOException, ClassNotFoundException {
		Subclass subclass = new Subclass();
		subclass.setPropertyA("A");
		subclass.setPropertyB("B");
		subclass.setPropertyC("C");

		ByteArrayOutputStream bos = SerializationUtil.serialize(subclass);
		System.out.println("Subclass: "+subclass);

		subclass = (Subclass)SerializationUtil.deserialize(bos.toByteArray());
		System.out.println("Subclass: "+subclass);
	}
}

The output shows that the superclass properties are lost after deserialization:
Subclass: Subclass{propertyA=A, propertyB=B, propertyC=C}
Subclass: Subclass{propertyA=null, propertyB=null, propertyC=C}

What happens behind the scenes is that the JVM reads the class information from the stream and creates an instance of Subclass without calling the Subclass constructor. Before deserializing the properties of the subclass, the JVM checks whether the superclass is Serializable and, if so, deserializes its properties first. If SuperClass is not Serializable, the JVM checks whether it has an accessible default (no-argument) constructor and calls it to initialize the superclass properties with default values.
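The constructor behavior just described can be observed directly. The following is a small sketch with hypothetical classes (Base, Derived) that record every constructor call in a list, so we can see which ones run during deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

class ConstructorCallDemo {

    // Records which constructors have run.
    static final List<String> calls = new ArrayList<>();

    // Not Serializable, but has an accessible no-arg constructor.
    static class Base {
        String a;

        Base() {
            calls.add("Base()");
            a = "default-a";
        }
    }

    static class Derived extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        String c;

        Derived(String c) {
            calls.add("Derived(String)");
            this.c = c;
        }
    }

    static Derived roundTrip(Derived d) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(d);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return (Derived) ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Derived original = new Derived("C");
        original.a = "A";
        calls.clear();  // forget the calls made while building the original

        Derived copy = roundTrip(original);
        System.out.println(calls);   // [Base()]
        System.out.println(copy.a);  // default-a
        System.out.println(copy.c);  // C
    }
}
```

Only the constructor of the non-serializable superclass runs; the Derived constructor is skipped and its field is restored from the stream.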

In the SerializationTest run above, the properties of SuperClass did not get deserialized because SuperClass is not Serializable. Note, however, that the SuperClass constructor is called as part of the deserialization mechanism. As the default constructor does not initialize any of the properties of the class, all of them end up null. So how can we make sure that, when a Subclass instance is deserialized, the properties of SuperClass are restored as well? This is where the readObject and writeObject methods of the serialization process come in. Let's say we redefine the subclass:

public class Subclass extends SuperClass implements Serializable {
	private String propertyC;

	public String getPropertyC() {
		return propertyC;
	}
	public void setPropertyC(String propertyC) {
		this.propertyC = propertyC;
	}

	private void readObject(ObjectInputStream stream)
	throws IOException, ClassNotFoundException{
		stream.defaultReadObject();
		// Cast instead of calling toString() so that null values survive the round trip
		setPropertyA((String) stream.readObject());
		setPropertyB((String) stream.readObject());
	}

	private void writeObject(ObjectOutputStream stream)
	throws IOException{
		stream.defaultWriteObject();
		stream.writeObject(getPropertyA());
		stream.writeObject(getPropertyB());
	}

	@Override
	public String toString() {
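		// Objects here is Guava's com.google.common.base.Objects; newer Guava versions
		// provide the same helper as MoreObjects.toStringHelper
		return Objects.toStringHelper(this).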
		add("propertyA", getPropertyA()).
		add("propertyB", getPropertyB()).
		add("propertyC", getPropertyC()).
		toString();
	}
}

During serialization, the JVM automatically checks whether writeObject is declared in your class and, if so, calls it. The same holds for readObject during deserialization. Both methods are private: the serialization runtime can invoke them, but ordinary callers cannot. In the code above, we first ask the JVM to carry on with its default serialization process (stream.defaultWriteObject()). Once that is done, we write the superclass properties to the stream ourselves. Likewise, when deserializing, after the default process (stream.defaultReadObject()) we manually read back the properties we wrote, in the same order. After this change, the output would be:
Subclass: Subclass{propertyA=A, propertyB=B, propertyC=C}
Subclass: Subclass{propertyA=A, propertyB=B, propertyC=C}

Now suppose the default constructor is not present in the superclass. Deserialization then fails with an InvalidClassException reporting that there is no valid constructor. So for a subclass to be deserializable, its non-serializable superclass must have an accessible default constructor.

public class SuperClass {
	private String propertyA;
	private String propertyB;

	public SuperClass(String a, String b){
		this.propertyA = a;
		this.propertyB = b;
	}

	// Getters and Setters
}

public class Subclass extends SuperClass implements Serializable {
	private String propertyC;

	public Subclass(String a, String b, String c){
		super(a, b);
		this.propertyC= c;
	}

	//Getters and Setters
}

public class SerializationTest {
	public static void main(String[] args) throws FileNotFoundException, IOException, ClassNotFoundException {
		Subclass subclass = new Subclass("A", "B", "C");

		ByteArrayOutputStream bos = SerializationUtil.serialize(subclass);
		System.out.println("Subclass: "+subclass);

		subclass = (Subclass)SerializationUtil.deserialize(bos.toByteArray());
		System.out.println("Subclass: "+subclass);
	}
}

Serializing the subclass succeeds, but on trying to deserialize it we get an exception saying:
Exception in thread "main" java.io.InvalidClassException: com.pramati.model.Subclass; com.pramati.model.Subclass; no valid constructor
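To see at which step the failure occurs, the two halves of the round trip can be tried separately. This is a minimal sketch with hypothetical Base and Derived classes mirroring the shape of SuperClass and Subclass; the attempt method and its result strings are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class NoValidConstructorDemo {

    // Not Serializable and without a no-arg constructor.
    static class Base {
        final String a;

        Base(String a) {
            this.a = a;
        }
    }

    static class Derived extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        final String c;

        Derived(String a, String c) {
            super(a);
            this.c = c;
        }
    }

    static byte[] serialize(Object obj) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        return bos.toByteArray();
    }

    // Reports which half of the round trip fails, if any.
    static String attempt() {
        byte[] bytes;
        try {
            bytes = serialize(new Derived("A", "C"));
        } catch (IOException e) {
            return "serialization failed: " + e;
        }
        try (ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            ois.readObject();
            return "round trip succeeded";
        } catch (InvalidClassException e) {
            return "deserialization failed: " + e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            return "unexpected failure: " + e;
        }
    }

    public static void main(String[] args) {
        // Expected to report a deserialization failure mentioning "no valid constructor".
        System.out.println(attempt());
    }
}
```

The bytes are written without complaint; the missing constructor only matters when the runtime has to instantiate the object again.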

So how do we resolve this problem? We cannot simply switch to the Externalizable interface, because Subclass itself does not have a default constructor, and for externalization to work a class must have a public no-argument constructor. Suppose we modify our Subclass to implement Externalizable and give it a default constructor:

public class Subclass extends SuperClass implements Serializable, Externalizable {
	private String propertyC;
	public Subclass() {
		this("", "", "");
	}
	public Subclass(String a, String b, String c){
		super(a, b);
		this.propertyC= c;
	}

	@Override
	public void readExternal(ObjectInput in) throws IOException,
			ClassNotFoundException {
		// Cast instead of calling toString() so that null values survive the round trip
		setPropertyA((String) in.readObject());
		setPropertyB((String) in.readObject());
		setPropertyC((String) in.readObject());
	}

	@Override
	public void writeExternal(ObjectOutput out) throws IOException {
		out.writeObject(getPropertyA());
		out.writeObject(getPropertyB());
		out.writeObject(getPropertyC());
	}
}

Now if we run the second version of SerializationTest again, it works fine and gives the result:
Subclass: Subclass{propertyA=A, propertyB=B, propertyC=C}
Subclass: Subclass{propertyA=A, propertyB=B, propertyC=C}
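The role of the public no-argument constructor in externalization can be sketched as follows. Tag and BrokenTag are hypothetical classes for illustration: the first has the required constructor and logs the calls made on it, the second lacks it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InvalidClassException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

class ExternalizableConstructorDemo {

    // Records constructor and readExternal calls on Tag.
    static final List<String> calls = new ArrayList<>();

    public static class Tag implements Externalizable {
        String name;

        public Tag() {
            calls.add("Tag()");
        }

        public Tag(String name) {
            calls.add("Tag(String)");
            this.name = name;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(name);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            calls.add("readExternal");
            name = (String) in.readObject();
        }
    }

    // Same shape, but without a public no-arg constructor.
    public static class BrokenTag implements Externalizable {
        String name;

        public BrokenTag(String name) {
            this.name = name;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeObject(name);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
            name = (String) in.readObject();
        }
    }

    // Returns "ok" or a description of the failure.
    static String describeRoundTrip(Object obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                ois.readObject();
                return "ok";
            }
        } catch (InvalidClassException e) {
            return "failed: " + e.getMessage();
        } catch (IOException | ClassNotFoundException e) {
            return "unexpected: " + e;
        }
    }

    public static void main(String[] args) {
        System.out.println(describeRoundTrip(new Tag("x")));  // ok
        System.out.println(calls);  // [Tag(String), Tag(), readExternal]
        // Expected to report a failure mentioning "no valid constructor".
        System.out.println(describeRoundTrip(new BrokenTag("x")));
    }
}
```

During deserialization the runtime first calls the public no-argument constructor and then readExternal, which is why the dummy constructor in Subclass is needed at all.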

Suppose we do not want the hack of providing a dummy default constructor for Subclass. How do we resolve the problem then? This is where the readResolve and writeReplace methods of serialization come to the rescue. Modify the subclass as follows:

public class Subclass extends SuperClass implements Serializable {
	private String propertyC;
	public Subclass(String a, String b, String c){
		super(a, b);
		this.propertyC= c;
	}

	public Subclass(PropertyHolder holder) {
		this(holder.a, holder.b, holder.c);
	}

	public String getPropertyC() {
		return propertyC;
	}
	public void setPropertyC(String propertyC) {
		this.propertyC = propertyC;
	}

	private Object writeReplace() throws ObjectStreamException{
		return new PropertyHolder(this);
	}

	@Override
	public String toString() {
		return Objects.toStringHelper(this).
		add("propertyA", getPropertyA()).
		add("propertyB", getPropertyB()).
		add("propertyC", getPropertyC()).
		toString();
	}
}
public class PropertyHolder implements Serializable{
	public String a;
	public String b;
	public String c;

	public PropertyHolder(Subclass sub) {
		a = sub.getPropertyA();
		b = sub.getPropertyB();
		c = sub.getPropertyC();
	}

	private Object readResolve() throws ObjectStreamException{
		return new Subclass(this);
	}
}

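At the end of deserialization, the readResolve method is called to give the class a chance to replace the deserialized object with another instance. Conversely, writeReplace is called before serialization and replaces the object to be serialized. Here, writeReplace substitutes a PropertyHolder for the Subclass on the way out, and PropertyHolder's readResolve rebuilds a Subclass through its regular constructor on the way in, so neither SuperClass nor Subclass needs a default constructor. After using these methods, the output would be as expected: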

Subclass: Subclass{propertyA=A, propertyB=B, propertyC=C}
Subclass: Subclass{propertyA=A, propertyB=B, propertyC=C}
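For readers who want to run the writeReplace/readResolve approach without the Guava dependency, here is a compact, self-contained sketch of the same idea. Base, Derived, and DerivedProxy are hypothetical stand-ins for SuperClass, Subclass, and PropertyHolder, reduced to two properties.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationProxyDemo {

    // Not Serializable and without a no-arg constructor.
    static class Base {
        private final String a;

        Base(String a) {
            this.a = a;
        }

        String getA() {
            return a;
        }
    }

    static class Derived extends Base implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String c;

        Derived(String a, String c) {
            super(a);
            this.c = c;
        }

        String getC() {
            return c;
        }

        // Called by the serialization runtime before writing: the proxy is written instead.
        private Object writeReplace() {
            return new DerivedProxy(this);
        }
    }

    static class DerivedProxy implements Serializable {
        private static final long serialVersionUID = 1L;
        private final String a;
        private final String c;

        DerivedProxy(Derived d) {
            this.a = d.getA();
            this.c = d.getC();
        }

        // Called after the proxy is deserialized: rebuild the real object via its constructor.
        private Object readResolve() {
            return new Derived(a, c);
        }
    }

    static Object roundTrip(Object obj) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(obj);
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return ois.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Object copy = roundTrip(new Derived("A", "C"));
        Derived d = (Derived) copy;  // readResolve handed back a Derived, not the proxy
        System.out.println(d.getA() + " " + d.getC());  // A C
    }
}
```

Because the Derived instance is never deserialized directly, the missing no-argument constructor in Base no longer matters.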
