Type: Enhancement
Resolution: Fixed
Priority: P4
Fix Version/s: 1.4.0
Affects Version/s: 1.4.0
Component/s: core-libs
Labels:
- beta2
- data
- evolution
- optional
- serialization

Subcomponent:
java.io:serialization
Resolved In Build:
beta2
CPU:

x86
OS:

windows_nt
Verification:
Verified

Name: vuR10080 Date: 08/08/2000

###@###.### 2000-08-02

This RFE is filed to address problems in serialization uncovered by
4352819, "java.awt.ScrollPane deserialization problems".

Let's consider three versions of a sample class. We are interested in
evolution paths A->C and B->C.

        // version A
        class Test implements Serializable {
            static final long serialVersionUID = 123456789L;

            // no readObject/writeObject methods
        }

        // version B
        class Test implements Serializable {
            static final long serialVersionUID = 123456789L;

            private void writeObject(ObjectOutputStream s)
                throws IOException
            {
                s.defaultWriteObject();
            }

            private void readObject(ObjectInputStream s)
                throws ClassNotFoundException, IOException
            {
                s.defaultReadObject();
            }
        }

        // version C (evolved)
        class Test implements Serializable {
            static final long serialVersionUID = 123456789L;

            private void writeObject(ObjectOutputStream s)
                throws IOException
            {
                s.defaultWriteObject();
                // write optional stuff
            }

            private void readObject(ObjectInputStream s)
                throws ClassNotFoundException, IOException
            {
                s.defaultReadObject();
                // read stuff (handling OptionalDataException)
            }
        }

Let's consider the grammar of the serialized data stream as defined in
<docs/guide/serialization/spec/protocol.doc5.html>. The relevant
productions, abridged and with several nonterminals expanded, are:

newObject:
    TC_OBJECT classDesc newHandle classdata[]  // data for each class

classdata:
    values                    // SC_SERIALIZABLE & classDescFlag
                              // && !(SC_WRITE_METHOD & classDescFlags)

    values objectAnnotation   // SC_SERIALIZABLE & classDescFlag
                              // && SC_WRITE_METHOD & classDescFlags

objectAnnotation:
    endBlockData
    contents endBlockData     // contents written by writeObject
                              // or writeExternal PROTOCOL_VERSION_2.

Now let's consider the implications of this grammar for the evolutions
A->C and B->C of our sample class.

Since A doesn't have a writeObject method, the first production for
{classdata} will be used for A. Thus for A only its {values} are
written to the stream and no {objectAnnotation} is written, so the data
for A are immediately followed in the stream by the data for the next
object (e.g. data for a subclass, for the next field, or for the next
array item) or by some primitive data.

As the project evolves, a need arises to write optional data for the
sample class, i.e. evolution A->C happens. The class now has
writeObject and readObject methods. It seems natural to implement
readObject as follows:

    private void readObject(ObjectInputStream s)
        throws ClassNotFoundException, IOException
    {
        // read default fields
        s.defaultReadObject();

        // try to read optional data (introduced in version C)
        try {
            ... = s.readObject();
            // ...
        }
        catch (OptionalDataException e) {
            if (!e.eof) {
                // something bad happened, notify caller
                throw e;
            }
            // else: we just hit the end of optional data,
            // so we're done with the stream and can do further
            // processing of data just read
        }

        // do necessary setup if any
    }

And here the developer is unpleasantly surprised. When version C reads
data serialized by version A, the call to s.readObject() in the try
block will happily consume whatever data are present in the stream
after the data for version A: data that are totally unrelated to A but
may closely match what C expects. In the case of 4352819 it was the
next element of the components[] array of the Container that the
ScrollPane was a child of.

The problem is that the data for A don't have any terminator (as
explained above), so a call to s.readObject() cannot tell whether it's
reading optional data for A or some unrelated data that happened to
follow A in the stream.

OTOH, let's consider evolution B->C. Since version B has a writeObject
method, the {values} for B in the stream will be followed by an
{objectAnnotation}; for the trivial writeObject of B that
{objectAnnotation} will be simply the {endBlockData} marker. So if C
is presented with data serialized by B, the call to s.readObject() in
the try block will detect the {endBlockData} marker and throw an
OptionalDataException with eof == true, and C will be able to handle
the lack of optional data gracefully.
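
The difference between the two stream layouts can be observed directly.
The following is a minimal sketch, not part of the original report: the
class names PlainA and TrivialB are hypothetical stand-ins for versions
A and B, and the header parsing assumes the object under test is the
first thing in the stream and that its class descriptor has not been
written before. It reads {classDescFlags} from the serialized bytes and
checks whether the stream ends with the {endBlockData} marker.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class WriteMethodFlagDemo {

    // Hypothetical stand-in for version A: no writeObject/readObject.
    public static class PlainA implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 42;
    }

    // Hypothetical stand-in for version B: trivial writeObject/readObject.
    public static class TrivialB implements Serializable {
        private static final long serialVersionUID = 1L;
        int value = 42;

        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject();
        }

        private void readObject(ObjectInputStream s)
            throws ClassNotFoundException, IOException
        {
            s.defaultReadObject();
        }
    }

    static byte[] serialize(Object o) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(o);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Skips the stream header and the start of the first class descriptor,
    // then returns its classDescFlags byte.
    static int classDescFlags(byte[] stream) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(stream))) {
            in.readShort();  // STREAM_MAGIC
            in.readShort();  // STREAM_VERSION
            in.readByte();   // TC_OBJECT
            in.readByte();   // TC_CLASSDESC
            in.readUTF();    // class name
            in.readLong();   // serialVersionUID
            return in.readUnsignedByte();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static boolean hasWriteMethodFlag(Object o) {
        int flags = classDescFlags(serialize(o));
        return (flags & ObjectStreamConstants.SC_WRITE_METHOD) != 0;
    }

    // True if the last byte of the stream is the endBlockData marker.
    static boolean endsWithEndBlockData(Object o) {
        byte[] b = serialize(o);
        return b[b.length - 1] == ObjectStreamConstants.TC_ENDBLOCKDATA;
    }

    public static void main(String[] args) {
        System.out.println("A has SC_WRITE_METHOD: " + hasWriteMethodFlag(new PlainA()));
        System.out.println("B has SC_WRITE_METHOD: " + hasWriteMethodFlag(new TrivialB()));
        System.out.println("A ends with endBlockData: " + endsWithEndBlockData(new PlainA()));
        System.out.println("B ends with endBlockData: " + endsWithEndBlockData(new TrivialB()));
    }
}
```

For the A-like class the stream ends with the raw field value, while for
the B-like class the same value is followed by the terminator that a
later version can rely on.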

We can say that evolution B->C is automatically backward compatible.
OTOH, evolution A->C is not and C should take special care to handle
streams serialized for A.

Of course developers don't sprinkle those trivial writeObject methods
over every Serializable class in anticipation of evolution to C; that
is, evolution A->C is much more likely to happen in practice than
evolution B->C. For obvious ease-of-use reasons the most frequent
case should be the one best supported by the API, but is it so in this
case? Let's consider what version C needs to do to support A->C.

As was shown above, s.readObject doesn't throw OptionalDataException in
this situation.

Can we detect that the class data were serialized without an
{objectAnnotation}? No, we can't, as we don't have access to the class
descriptor for the version of the class *in the stream*. At best we can
use the ObjectStreamClass.forClass API to obtain "the class in the
*local* VM that this version is mapped to" (emphasis mine).

It seems that the only way to handle A->C is to use a trick that
duplicates the !(SC_WRITE_METHOD & classDescFlags) information in the
evolved class by means of an extra field that is added to the class at
the same time as writeObject. Since this new field and the writeObject
method are tied in this way, we can use the GetField API
(ObjectInputStream.GetField.defaulted) to check whether the new field
was defaulted. If it was, the lack of that field in the serial data
implies (by our arrangement) the lack of a writeObject method in the
version of the class that wrote these serial data, and thus we know
that reading any optional data is unsafe (because there's no
{objectAnnotation} in the stream).

Of course we don't have to use a dedicated field for the marker.
E.g. in the case of 4352819 a new field wheelScrollingEnabled was
added to ScrollPane anyway, so it could be used as a marker.

Thus the evolved version (let's call it C-prime) should have the
following readObject method to handle evolution from A.

    // Marker field introduced in the same version in which writeObject
    // is introduced. We don't have to use a dedicated field if any
    // "real" fields are added to the class.
    private boolean hasOptionalData = true;

    private void readObject(ObjectInputStream s)
        throws ClassNotFoundException, IOException
    {
        // read default fields; can't use s.defaultReadObject,
        // have to use GetField API to be able to check for 'defaulted'.
        ObjectInputStream.GetField f = s.readFields();

        // assign fields manually; this is boring and error-prone, and
        // it shouldn't be like that, but the defaultReadObject and
        // GetField APIs are an all-or-nothing choice.
        someField = f.get("someField", null);
        // ... etc, ad nauseam

        if (f.defaulted("hasOptionalData")) {
            // stream written by A, no optional data, any further
            // reading from the stream will be a disaster
        }
        else {
            // try to read optional data (version C and later)
            try {
                ... = s.readObject();
                // ...
            }
            catch (OptionalDataException e) {
                if (!e.eof) {
                    // something bad happened, notify caller
                    throw e;
                }
                // else: we just hit the end of optional data,
                // so we're done with the stream and can do further
                // processing of data just read
            }
        }

        // do necessary setup if any
    }

As you can see, this approach, while it works, places on the developer
the burden of using the GetField API (very inconvenient in this case).
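
To make the marker-field workaround concrete, here is a self-contained
sketch under stated assumptions. OldTest and NewTest are hypothetical
classes playing the roles of A and C-prime. Since two versions of one
class cannot live in a single file, the sketch simulates evolution by
overwriting the class name in the serialized bytes (the two names have
equal length); this byte patching is purely a demonstration device and
not something the report suggests.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MarkerFieldDemo {

    // Hypothetical stand-in for version A: no writeObject, no marker field.
    public static class OldTest implements Serializable {
        private static final long serialVersionUID = 123456789L;
        int someField = 5;
    }

    // Hypothetical stand-in for version C-prime.
    public static class NewTest implements Serializable {
        private static final long serialVersionUID = 123456789L;
        int someField = 5;
        boolean hasOptionalData = true;     // marker added together with writeObject
        transient String optional;
        transient boolean writtenByOldVersion;

        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject();
            s.writeObject("optional stuff");
        }

        private void readObject(ObjectInputStream s)
            throws ClassNotFoundException, IOException
        {
            ObjectInputStream.GetField f = s.readFields();
            someField = f.get("someField", 0);
            hasOptionalData = true;  // only its presence in the stream matters

            if (f.defaulted("hasOptionalData")) {
                // marker absent: written by the old version, do not read further
                writtenByOldVersion = true;
            } else {
                optional = (String) s.readObject();
            }
        }
    }

    // Serializes the object, renames OldTest to NewTest in the bytes
    // (a no-op when a NewTest was written), and deserializes the result.
    static NewTest roundTrip(Object toWrite) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(toWrite);
            }
            byte[] data = bytes.toByteArray();
            rename(data, OldTest.class.getName(), NewTest.class.getName());
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(data))) {
                return (NewTest) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Overwrites every occurrence of one class name with another of equal length.
    static void rename(byte[] data, String from, String to) {
        byte[] oldName = from.getBytes(StandardCharsets.ISO_8859_1);
        byte[] newName = to.getBytes(StandardCharsets.ISO_8859_1);
        if (oldName.length != newName.length) {
            throw new IllegalArgumentException("names must have equal length");
        }
        for (int i = 0; i + oldName.length <= data.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(data, i, i + oldName.length), oldName)) {
                System.arraycopy(newName, 0, data, i, newName.length);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("old stream, marker defaulted: "
            + roundTrip(new OldTest()).writtenByOldVersion);
        System.out.println("new stream, optional data: "
            + roundTrip(new NewTest()).optional);
    }
}
```

Note how every default field has to be copied out of the GetField
object by hand, which is exactly the inconvenience described above.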

It is proposed that an API be added to ObjectInputStream to query the
stream for the presence of an {objectAnnotation}. The stream already
has this information, since it has read the {classDesc} by the time
the {classdata} is being read. I'm not sure in which particular form
this information can be made available; this is left to your expert
judgement. However, let me present a few suggestions.

The best solution, if possible, is for s.defaultReadObject() and
s.readFields() to set a flag indicating that the {values} for the
current class have been read from the stream. This flag will be reset
when invokeObjectReader returns. If this flag is set and s.readObject()
or another read* method is invoked, the stream can check the
{classDescFlags} written to the stream and, if SC_WRITE_METHOD is not
present in {classDescFlags}, throw an OptionalDataException with
eof == true.

This approach will make evolution A->C automatically compatible, just
like B->C is. This is the best solution from the ease-of-use and
least-surprise points of view.
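
The following sketch shows what the proposed behavior would look like
from the point of view of version C. It assumes a runtime that behaves
as proposed above; since this issue is marked as fixed in 1.4.0, later
JDKs are assumed (not verified here for every release) to behave this
way. TestA and TestC are hypothetical stand-ins for versions A and C,
and, as a demonstration device only, evolution is simulated by
overwriting the class name in the serialized bytes with a name of equal
length.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OptionalDataException;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class OptionalDataEvolutionDemo {

    // Hypothetical stand-in for version A: default serialization only.
    public static class TestA implements Serializable {
        private static final long serialVersionUID = 123456789L;
        int value = 7;
    }

    // Hypothetical stand-in for version C: reads optional data naively.
    public static class TestC implements Serializable {
        private static final long serialVersionUID = 123456789L;
        int value;
        transient Object optional;
        transient boolean sawEndOfData;

        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject();
            s.writeObject("optional stuff");
        }

        private void readObject(ObjectInputStream s)
            throws ClassNotFoundException, IOException
        {
            s.defaultReadObject();
            try {
                optional = s.readObject();
            } catch (OptionalDataException e) {
                if (!e.eof) {
                    throw e;
                }
                sawEndOfData = true;  // no optional data in this stream
            }
        }
    }

    // Writes { TestA, "next element" }, renames TestA to TestC in the
    // bytes, and reads the array back with TestC as the local class.
    static Object[] readOldStreamWithNewClass() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Object[] { new TestA(), "next element" });
            }
            byte[] data = bytes.toByteArray();
            rename(data, TestA.class.getName(), TestC.class.getName());
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(data))) {
                return (Object[]) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Overwrites every occurrence of one class name with another of equal length.
    static void rename(byte[] data, String from, String to) {
        byte[] oldName = from.getBytes(StandardCharsets.ISO_8859_1);
        byte[] newName = to.getBytes(StandardCharsets.ISO_8859_1);
        if (oldName.length != newName.length) {
            throw new IllegalArgumentException("names must have equal length");
        }
        for (int i = 0; i + oldName.length <= data.length; i++) {
            if (Arrays.equals(Arrays.copyOfRange(data, i, i + oldName.length), oldName)) {
                System.arraycopy(newName, 0, data, i, newName.length);
            }
        }
    }

    public static void main(String[] args) {
        Object[] arr = readOldStreamWithNewClass();
        TestC c = (TestC) arr[0];
        System.out.println("end of optional data reported: " + c.sawEndOfData);
        System.out.println("next array element intact: " + arr[1]);
    }
}
```

On a runtime without the proposed check, the s.readObject() call in
TestC would instead be expected to consume the string that follows the
object in the array, which is the failure mode reported in 4352819.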

For API completeness this information could also be exported in the
form of new API calls on ObjectInputStream and/or ObjectStreamClass,
e.g.:

    public class ObjectInputStream {

        /**
         * Returns class descriptor flags for the current class.
         * @see ObjectStreamConstants
         */
        public int getStreamClassFlags();

    }

Here you are in a better position to design the particular form of
these new APIs.

======================================================================

relates to

JDK-4352819 java.awt.ScrollPane deserialization problems

Closed

JDK-4388704 end of custom data read behavior inconsistent and unclearly specified

Closed
