Tuesday, March 3, 2015

What’s the deal with serialization?

Today we’re going to talk about a basic (but from time to time useful) feature of Java: serialization.
Let’s start with theory. So what is serialization?
To serialize an object means to convert its state to a byte stream so that the byte stream can be reverted back into a copy of the object.
We can make instances of a class serializable by implementing either the java.io.Serializable or the java.io.Externalizable interface. To be precise, the class itself doesn’t have to implement one of these interfaces; it is enough that any of its superclasses does.

It’s also worth mentioning the reverse process, which is called deserialization:
[It] is the process of converting the serialized form of an object back into a copy of the object.
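
To make both definitions concrete, here is a minimal sketch of a round trip done entirely in memory. The class name RoundTripSketch and its two helper methods are my own inventions for illustration; only the java.io stream classes come from the JDK.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the article's code base.
public class RoundTripSketch {

    // Serialization: object state -> byte stream.
    public static byte[] toBytes(Object object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Deserialization: byte stream -> a copy of the object.
    public static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> original = new ArrayList<>(Arrays.asList("Sebastian", "Monika"));
        Object copy = fromBytes(toBytes(original));

        System.out.println(original.equals(copy)); // true: same state
        System.out.println(original == copy);      // false: a different instance
    }
}
```

The copy is equal to the original but is not the same object, which is exactly what "a copy of the object" means in the definitions above.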

Serializable vs. Externalizable

I mentioned at the beginning that we can reach our goal and make our objects serializable by using either the Serializable or the Externalizable interface. What’s the difference between them?

Well, Serializable is a marker interface: all you have to do to make objects serializable is to implement it and… that’s all. Everything will work, without any additional effort or hassle (at least hopefully :)

Externalizable is a subinterface of Serializable, but it declares two methods that have to be implemented and that are used in the serialization process: writeExternal() and readExternal(). They give us control over how serialization and deserialization are done.
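
If you would like to verify the shape of both interfaces yourself, a few lines of reflection are enough. This is only a sketch; the class InterfaceShapeSketch and its method names are made up for this illustration.

```java
import java.io.Externalizable;
import java.io.Serializable;

// Hypothetical demo class, used only to inspect the two JDK interfaces.
public class InterfaceShapeSketch {

    // A marker interface declares no methods at all.
    public static int serializableMethodCount() {
        return Serializable.class.getDeclaredMethods().length;
    }

    // Externalizable declares writeExternal() and readExternal().
    public static int externalizableMethodCount() {
        return Externalizable.class.getDeclaredMethods().length;
    }

    // Every Externalizable is also a Serializable.
    public static boolean externalizableIsSerializable() {
        return Serializable.class.isAssignableFrom(Externalizable.class);
    }

    public static void main(String[] args) {
        System.out.println(serializableMethodCount());      // 0
        System.out.println(externalizableMethodCount());    // 2
        System.out.println(externalizableIsSerializable()); // true
    }
}
```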

Now, armed with this knowledge, we can write some code :)

Let me introduce my Serializer class

To make my life and yours easier, I’ve created a Serializer class which makes the tests more readable. The code of the class is as follows:
package com.smalaca.blogging.serializable;

import java.io.*;

public class Serializer {

    public static void serializeSafe(Object serializableObject, String fileName) {
        try {
            serialize(serializableObject, fileName);
        } catch (IOException e) {
            // Deliberately ignored: this helper is used only in tests, which fail anyway.
        }
    }

    public static void serialize(Object serializableObject, String fileName)
            throws IOException {
        // try-with-resources flushes and closes the object stream (and the file beneath it).
        try (ObjectOutputStream objectOutput =
                     new ObjectOutputStream(new FileOutputStream(fileName))) {
            objectOutput.writeObject(serializableObject);
        }
    }

    public static Object deserializeSafe(String fileName) {
        try {
            return deserialize(fileName);
        } catch (IOException | ClassNotFoundException e) {
            return null;
        }
    }

    public static Object deserialize(String fileName)
            throws IOException, ClassNotFoundException {
        // The stream is closed even when readObject() throws.
        try (ObjectInputStream objectInput =
                     new ObjectInputStream(new FileInputStream(fileName))) {
            return objectInput.readObject();
        }
    }
}

I believe that in a real application each of you (and me as well) would avoid methods like serializeSafe() and deserializeSafe() and would handle the thrown exceptions in a better way. However, this class will be used only in our tests, and those tests will fail anyway if something goes wrong, so in my opinion we can allow ourselves such a construction.
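
Just to show what "a better way" could look like, below is one possible sketch: instead of swallowing the exception, it is rethrown as an unchecked one that carries the file name. UncheckedSerializer is a hypothetical name, and wrapping in UncheckedIOException is my assumption about a reasonable policy, not the only option.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical alternative to the "safe" methods: fail loudly, but without checked exceptions.
public class UncheckedSerializer {

    public static void serialize(Object object, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot serialize to " + file, e);
        }
    }

    public static Object deserialize(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot deserialize from " + file, e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Unknown class in " + file, e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sketch", ".serialization");
        serialize("hello", file);
        System.out.println(deserialize(file));
        Files.delete(file);
    }
}
```

The caller no longer gets a silent null, and the original cause stays attached to the exception.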

Serialization of a simple object

OK, so let’s start with something simple. We will create a Person class which will pass our test:
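// Assumption: JUnit 4 with Hamcrest matchers on the classpath.
import org.junit.Test;

import java.io.IOException;

import static org.hamcrest.CoreMatchers.is;
import static org.junit.Assert.assertThat;

public class PersonTest {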

    @Test
    public void shouldBeSerializable() throws IOException {
        String fileName = "sebastian.serialization";
        Person sebastian = new Person("Sebastian", "Kraków");

        Serializer.serialize(sebastian, fileName);

        Person deserializedSebastian = (Person) Serializer.deserializeSafe(fileName);

        assertThat(deserializedSebastian.toString(), is(sebastian.toString()));
    }
}

We can start with something like this:
public class Person {
    private final String name;
    private final String city;

    public Person(String name, String city) {
        this.name = name;
        this.city = city;
    }

    @Override
    public String toString() {
        return this.name + " is living in " + this.city;
    }
}

I haven’t implemented any interface yet, to show you what happens when you run this test. Let’s do it.
Our test is, as expected, red, and we are informed that java.io.NotSerializableException was thrown.

So far so good. Now we can fix the issue by changing one line of our code to:
public class Person implements Serializable

Run your tests once again and now everything is green. Great, we can move on.

Let’s complicate it a little bit

Now it’s time for an improvement: we can replace the String field city with something better:
public class Address {
    private final String street;
    private final String city;

    public Address(String street, String city) {
        this.street = street;
        this.city = city;
    }

    @Override
    public String toString() {
        return street + ", " + city;
    }
}

The code of our main class is as follows:
public class Person implements Serializable {
    private final String name;
    private final Address address;

    public Person(String name, Address address) {
        this.name = name;
        this.address = address;
    }

    @Override
    public String toString() {
        return this.name + " is living on " + this.address;
    }
}

And what about our test? We have to change the construction of the object, as shown below:
@Test
public void shouldBeSerializable() throws IOException {
    String fileName = "sebastian.serialization";
    Person sebastian = new Person("Sebastian", new Address("Floriańska", "Kraków"));

    Serializer.serialize(sebastian, fileName);

    Person deserializedSebastian = (Person) Serializer.deserializeSafe(fileName);
    assertThat(deserializedSebastian.toString(), is(sebastian.toString()));
}

Let’s run it. The test is red and once again we get the well-known java.io.NotSerializableException. But this time the reason is our Address class.
The fix is simple: it’s enough to make this class implement the Serializable interface, run the test once again and enjoy its green state :)
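
The same behavior can be reproduced in memory with a small sketch. The classes Street and Resident below are hypothetical stand-ins for Address and Person; the sketch also shows a detail that is easy to miss: what matters is the object actually referenced at serialization time, so a null reference to a non-serializable type causes no problem.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical classes, used only to illustrate the object-graph rule.
public class NestedFieldSketch {

    // Deliberately NOT Serializable.
    public static class Street {
        final String name;

        public Street(String name) {
            this.name = name;
        }
    }

    public static class Resident implements Serializable {
        final String name;
        final Street street;

        public Resident(String name, Street street) {
            this.name = name;
            this.street = street;
        }
    }

    // Returns "ok" or a short description of the failure.
    public static String trySerialize(Object object) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(object);
            return "ok";
        } catch (NotSerializableException e) {
            return "not serializable: " + e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The whole object graph is written, so the Street instance breaks serialization.
        System.out.println(trySerialize(new Resident("Sebastian", new Street("Florianska"))));
        // With no Street object in the graph there is nothing to complain about.
        System.out.println(trySerialize(new Resident("Sebastian", null)));
    }
}
```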

When we really don’t want to store something

Yet sometimes we don’t want to store all the information about an object. What should we do in such a case?
Let’s figure it out.

This time we will start with a User class:
public class User implements Serializable {
    private final String login;
    private boolean isLogged = false;

    public User(String login) {
        this.login = login;
    }

    public void logIn() {
        isLogged = true;
    }

    public void logOut() {
        isLogged = false;
    }

    @Override
    public String toString() {
        return login + " is logged " + (isLogged ? "in" : "out") + ".";
    }
}

Let’s assume that we want to serialize only the login. We don’t care about the state of the isLogged field, which means that in each deserialized object it should be set to false, its default value.

Our test can look like this:
@Test
public void shouldBeSerializable() {
    String fileName = "smalaca.serialization";
    User user = new User("smalaca");
    user.logIn();

    Serializer.serializeSafe(user, fileName);
 
    User deserializedUser = (User) Serializer.deserializeSafe(fileName);
    assertThat(deserializedUser.toString(), is("smalaca is logged out."));
}

After we run the test, we will see that:
java.lang.AssertionError: Expected: is "smalaca is logged out." but: was "smalaca is logged in."

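Yeah, we could have expected this result.

In this case the transient keyword comes to the rescue. Placed next to a field, it means nothing more than that the field shouldn’t be serialized. On deserialization such a field gets the default value of its type (false for a boolean).
Let’s verify it: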
private transient boolean isLogged = false;

And what’s the result of your test now? Better, isn’t it? Yes, the test passes.
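
One detail is worth a separate sketch: a transient field is restored to the default value of its type, not to the value from its initializer, because initializers of a Serializable class are not run during deserialization. The Session class below is hypothetical and exists only to show this.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class: both transient fields have non-default initializers on purpose.
public class TransientSketch {

    public static class Session implements Serializable {
        private final String login;
        private transient boolean loggedIn = true;
        private transient String token = "secret";

        public Session(String login) {
            this.login = login;
        }

        public String describe() {
            return login + " / " + loggedIn + " / " + token;
        }
    }

    // Serializes and deserializes the session in memory.
    public static Session roundTrip(Session session) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(session);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session session = new Session("smalaca");
        System.out.println(session.describe());            // smalaca / true / secret
        System.out.println(roundTrip(session).describe()); // smalaca / false / null
    }
}
```

In our User class the initializer and the type default are both false, so the difference is invisible there.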

And what about inheritance?

And what can stop us from serializing an object whose class has a parent?
As Charles Kettering said “Our imagination is the only limit” :)

We have the following code:
public abstract class Mammal {
    private final String sex;

    Mammal(String sex) {
        this.sex = sex;
    }

    @Override
    public String toString() {
        return "sex is " + sex + ".";
    }
}
 
public class Human extends Mammal implements Serializable {
    private final String name;

    public Human(String name, String sex) {
        super(sex);
        this.name = name;
    }

    @Override
    public String toString() {
        return "My name is " + name + " and my " + super.toString();
    }
}

And test for it:
@Test
public void shouldBeSerializable() throws IOException, ClassNotFoundException {
    String fileName = "monika.serialization";
    Human monika = new Human("Monika", "female");

    Serializer.serialize(monika, fileName);

    Human deseriazlied = (Human) Serializer.deserialize(fileName);
    assertThat(deseriazlied.toString(), is("My name is Monika and my sex is female."));
}

If we run this test, we will get the following information:
java.io.InvalidClassException: com.smalaca.blogging.serializable.inheritance.insub.Human; no valid constructor


Why is that? Mammal is not serializable, so its sex field was never written to the file. During deserialization Java has to initialize that part of the object by calling the no-argument constructor of the first non-serializable superclass, and Mammal doesn’t have one.
There was no problem with serialization; however, when we tried to deserialize the object we were not able to create it again, because of this missing constructor (and the missing value). Of course we can solve this problem easily and quickly by adding a constructor which initializes the sex field with some default value, but this is not what we want to achieve.
Yet it’s worth remembering that serialization works when only the child class is serializable (and the parent defines an accessible no-argument constructor), but we lose all information about the fields defined in the parent.

You can check this by adding a default constructor to the Mammal class, as shown below:
public Mammal() {
    this("male");
}

and changing our assertion (notice that we’ve lost the information about the sex of the tested object):
assertThat(deseriazlied.toString(), is("My name is Monika and my sex is male."));
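
The same effect can be shown in a self-contained way. In the sketch below Animal and Dog are hypothetical classes that mirror the Mammal and Human setup: the parent is not serializable but has a public no-argument constructor, and the child is serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical classes mirroring the "serializable child, plain parent" case.
public class InheritanceSketch {

    // NOT Serializable: its state never reaches the byte stream.
    public static class Animal {
        protected String sound;

        // Called during deserialization to initialize the parent part of the object.
        public Animal() {
            this.sound = "default";
        }

        public Animal(String sound) {
            this.sound = sound;
        }
    }

    public static class Dog extends Animal implements Serializable {
        private final String name;

        public Dog(String name, String sound) {
            super(sound);
            this.name = name;
        }

        public String describe() {
            return name + " says " + sound;
        }
    }

    // Serializes and deserializes the dog in memory.
    public static Dog roundTrip(Dog dog) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(dog);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Dog) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Dog rex = new Dog("Rex", "woof");
        System.out.println(rex.describe());            // Rex says woof
        System.out.println(roundTrip(rex).describe()); // Rex says default
    }
}
```

The child's own field survives, while the parent's field is whatever the no-argument constructor sets.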


So what can we do? We can revert our code to the state from before introducing the default constructor (we don’t need it anyway) and change two lines in the code.
Make the Mammal class implement Serializable:
public abstract class Mammal implements Serializable

and remove it from the Human class:
public class Human extends Mammal


Now let’s run our test. This time everything is as we wanted: the test passes.

We can always try something different

OK, but there are situations when you want to add something of your own to the serialization process. What then?
As I wrote at the beginning, in such a case we can use the Externalizable interface.

Let’s take a look at our User class once again:
public class User {
    private final String login;
    private boolean isLogged = false;

    public User(String login) {
        this.login = login;
    }

    public void logIn() {
        isLogged = true;
    }

    public void logOut() {
        isLogged = false;
    }

    @Override
    public String toString() {
        return login + " is logged " + (isLogged ? "in" : "out") + ".";
    }
}

This time the point is not to skip the information whether the user is logged in. Instead, we don’t want serialization to be possible at all while the user is logged in.
Let’s start with tests:
@Test(expected = IOException.class)
public void shouldBeNotAbleToSerializeLoggedUser() throws IOException {
    String fileName = "smalaca.serialization";
    User user = new User("smalaca");
    user.logIn();

    Serializer.serialize(user, fileName);
}

@Test
public void shouldBeSerializable() {
    String fileName = "smalaca.serialization";
    User user = new User("smalaca");
    user.logIn();
    user.logOut();

    Serializer.serializeSafe(user, fileName);

    User deserializedUser = (User) Serializer.deserializeSafe(fileName);

    assertThat(deserializedUser.toString(), is(user.toString()));
}

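Of course, if we ran those tests now, they would not tell us what we want to know. Why? Because our class is not serializable. The second test fails, and the first one passes only by accident: java.io.NotSerializableException is a subclass of IOException, so the expected exception is thrown for the wrong reason.

So let’s make both tests pass for the right reason using the Externalizable interface, which brings two new methods that we can use: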
public class User implements Externalizable {
    private String login;
    private boolean isLogged = false;

    // Externalizable requires a public no-argument constructor for deserialization.
    public User() {}

    public User(String login) {
        this.login = login;
    }

    public void logIn() {
        isLogged = true;
    }

    public void logOut() {
        isLogged = false;
    }

    @Override
    public String toString() {
        return login + " is logged " + (isLogged ? "in" : "out") + ".";
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        if (isLogged) {
            throw new IOException("Cannot serialize logged user.");
        }
        out.writeObject(login);
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException, ClassNotFoundException {
        login = (String) in.readObject();
    }
}

What will happen when we run our tests now? They will pass.

In the example I showed only the manipulation during serialization, but nothing stops you from adding behavior to the readExternal() method. I believe you now have enough knowledge to try it yourself.
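
If you’d like a starting point, here is one possible sketch of extra behavior in readExternal(): the value read from the stream is validated before it is accepted. The Account class and its "login must not be blank" rule are assumptions made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.InvalidObjectException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

// Hypothetical class with a made-up validation rule in readExternal().
public class ExternalizableSketch {

    public static class Account implements Externalizable {
        private String login;

        // Required by Externalizable: public no-argument constructor.
        public Account() {}

        public Account(String login) {
            this.login = login;
        }

        public String getLogin() {
            return login;
        }

        @Override
        public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(login);
        }

        @Override
        public void readExternal(ObjectInput in) throws IOException {
            String value = in.readUTF();
            // Extra behavior on the reading side: reject data we don't trust.
            if (value.trim().isEmpty()) {
                throw new InvalidObjectException("Login must not be blank.");
            }
            login = value;
        }
    }

    // Serializes and deserializes the account in memory.
    public static Account roundTrip(Account account) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(account);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Account("smalaca")).getLogin());
        try {
            roundTrip(new Account("   "));
        } catch (UncheckedIOException e) {
            System.out.println("rejected: " + e.getCause().getMessage());
        }
    }
}
```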

This magic serialVersionUID

OK, the last thing left is this magic static final field called serialVersionUID. Do you really need it? And if so, what for?

The value of this field lets the runtime check whether a serialized object was an instance of a class compatible with the one we are using to deserialize it. You don’t have to declare it explicitly; if you don’t, the value is computed at runtime from the details of the class. Yet the result of this computation can depend on the compiler implementation, so it may turn out that an object serialized in one place cannot be deserialized in another, even if we use the same code.

Another situation in which this value is very useful is when you expect that the class may change in the future (and as we know, we should always expect that :). Then, when you change something in the class definition and want to signal that it is no longer compatible with the previous version, you can just change the value of serialVersionUID and you will get something like:
java.io.InvalidClassException: com.smalaca.blogging.serializable.withserialversionuid.Person; local class incompatible: stream classdesc serialVersionUID = 132, local class serialVersionUID = 12421
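
To see which value is actually in effect for a class, you can ask java.io.ObjectStreamClass. The sketch below uses two hypothetical classes, one with an explicit serialVersionUID and one without; for the latter the printed number is the computed default, so I don’t show a concrete value for it.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

// Hypothetical classes used only to compare declared and computed values.
public class SerialVersionSketch {

    public static class Versioned implements Serializable {
        private static final long serialVersionUID = 12421L;
        private String name;
    }

    public static class Unversioned implements Serializable {
        private String name;
    }

    // Returns the serialVersionUID the serialization runtime uses for the given class.
    public static long uidOf(Class<?> type) {
        return ObjectStreamClass.lookup(type).getSerialVersionUID();
    }

    public static void main(String[] args) {
        System.out.println(uidOf(Versioned.class));   // 12421
        System.out.println(uidOf(Unversioned.class)); // computed from the class details
    }
}
```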


If that’s not enough and you want to read more about serialVersionUID, you should look at the Oracle documentation.



I hope you liked this article and that you will soon be back for more :)


