JDK-4360508

Better serialization support for class evolution that introduces optional data

    • beta2
    • x86
    • windows_nt
    • Verified



      Name: vuR10080 Date: 08/08/2000

      ###@###.### 2000-08-02

      This RFE is filed to address problems in serialization uncovered by
      4352819, "java.awt.ScrollPane deserialization problems".

      Let's consider three versions of a sample class. We are interested in
      evolution paths A->C and B->C.

              // version A
              class Test implements Serializable {
                  static final long serialVersionUID = 123456789L;

                  // no readObject/writeObject methods
              }


              // version B
              class Test implements Serializable {
                  static final long serialVersionUID = 123456789L;

                  private void writeObject(ObjectOutputStream s)
                      throws IOException
                  {
                      s.defaultWriteObject();
                  }

                  private void readObject(ObjectInputStream s)
                      throws ClassNotFoundException, IOException
                  {
                      s.defaultReadObject();
                  }
              }


              // version C (evolved)
              class Test implements Serializable {
                  static final long serialVersionUID = 123456789L;

                  private void writeObject(ObjectOutputStream s)
                      throws IOException
                  {
                      s.defaultWriteObject();
                      // write optional stuff
                  }

                  private void readObject(ObjectInputStream s)
                      throws ClassNotFoundException, IOException
                  {
                      s.defaultReadObject();
                      // read stuff (handling OptionalDataException)
                  }
              }

      Let's consider the grammar of the serialized data stream as defined in
      <docs/guide/serialization/spec/protocol.doc5.html>. The relevant
      productions, abridged and with several nonterminals expanded, are:

      newObject:
          TC_OBJECT classDesc newHandle classdata[] // data for each class

      classdata:
         values // SC_SERIALIZABLE & classDescFlag
                                   // && !(SC_WRITE_METHOD & classDescFlags)

         values objectAnnotation // SC_SERIALIZABLE & classDescFlag
                                   // && SC_WRITE_METHOD & classDescFlags

      objectAnnotation:
             endBlockData
             contents endBlockData // contents written by writeObject
                                       // or writeExternal PROTOCOL_VERSION_2.


      Now let's consider the implications of this grammar for evolutions A->C
      and B->C of our sample class.

      Since A doesn't have a writeObject method, the first production for
      {classdata} will be used for A. Thus for A only its {values} are
      written to the stream and no {objectAnnotation} is written - so the data
      for A are immediately followed in the stream by the data for the next
      object (e.g. data for a subclass, for the next field or for the next
      array item) or by some primitive data.
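
      The difference between the two {classdata} productions can be observed
      directly in the bytes. The following is a minimal sketch, not part of
      the original report: the class names StreamLayoutDemo, VersionA and
      VersionB are hypothetical stand-ins for versions A and B of the sample
      class, and the offset arithmetic assumes a single top-level object with
      an ASCII class name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.Serializable;
import java.io.UncheckedIOException;

class StreamLayoutDemo {
    // Hypothetical stand-in for version A: no writeObject/readObject methods.
    static class VersionA implements Serializable {
        private static final long serialVersionUID = 123456789L;
        int value = 7;
    }

    // Hypothetical stand-in for version B: trivial writeObject.
    static class VersionB implements Serializable {
        private static final long serialVersionUID = 123456789L;
        int value = 7;

        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject();
        }
    }

    // Serializes a single object into a fresh stream and returns the bytes.
    static byte[] serialize(Object o) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(o);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // classDescFlags follows: stream header (4 bytes), TC_OBJECT (1),
    // TC_CLASSDESC (1), class name length (2), class name, serialVersionUID (8).
    static boolean hasWriteMethodFlag(Object o) {
        byte[] s = serialize(o);
        int flags = s[4 + 1 + 1 + 2 + o.getClass().getName().length() + 8];
        return (flags & ObjectStreamConstants.SC_WRITE_METHOD) != 0;
    }

    // True if the last byte of the stream is the {endBlockData} marker.
    static boolean endsWithEndBlockData(Object o) {
        byte[] s = serialize(o);
        return s[s.length - 1] == ObjectStreamConstants.TC_ENDBLOCKDATA;
    }

    public static void main(String[] args) {
        System.out.println(hasWriteMethodFlag(new VersionA()));   // false
        System.out.println(hasWriteMethodFlag(new VersionB()));   // true
        System.out.println(endsWithEndBlockData(new VersionA())); // false
        System.out.println(endsWithEndBlockData(new VersionB())); // true
    }
}
```

      The stream for VersionB is exactly one byte longer than the stream for
      VersionA: the trailing {endBlockData} marker. Nothing marks the end of
      the data written for VersionA.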

      As the project evolves, a need arises to write optional data for the
      sample class, i.e. evolution A->C happens. The class now has writeObject
      and readObject methods. It seems natural to implement readObject as
      follows:


          private void readObject(ObjectInputStream s)
              throws ClassNotFoundException, IOException
          {
              // read default fields
              s.defaultReadObject();

              // try to read optional data (introduced in version C)
              try {
                  ... = s.readObject();
                  // ...
              }
              catch (OptionalDataException e) {
                  if (!e.eof) {
                      // something bad happened, notify caller
                      throw e;
                  }
                  // else: we just hit the end of optional data,
                  // so we're done with the stream and can do further
                  // processing of data just read
              }

              // do necessary setup if any
          }

      And here the developer is unpleasantly surprised. When version C reads
      data serialized by version A, the call to s.readObject() in the try
      block will happily consume whatever data are present in the stream
      after the data for version A: data totally unrelated to A, but
      closely matching what the call expects. In the case of 4352819 it was
      the next element of the Container.components[] array that the
      ScrollPane was a child of.

      The problem is that the data for A don't have any terminator (as
      explained above), so a call to s.readObject() cannot tell whether it's
      reading optional data for A or some unrelated data that happened to
      follow A in the stream.

      OTOH, let's consider evolution B->C. Since version B has a writeObject
      method, {values} for B in the stream will be followed by an
      {objectAnnotation} - for the trivial writeObject of B that
      {objectAnnotation} will be simply the {endBlockData} marker. So if C
      is presented with data serialized by B, the call to s.readObject in the
      try block will detect the {endBlockData} marker and report an
      OptionalDataException with eof == true, and C will be able to handle
      the lack of optional data gracefully.
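
      The B->C behaviour can be tried out in a single program. This is only
      a sketch under stated assumptions: two versions of one class cannot
      coexist in a single source file, so the hypothetical static switch
      writeOptional makes the same class write either what version B would
      write (nothing after the default fields) or what version C would write
      (one optional String). The names OptionalDataDemo, writeOptional,
      optional and roundTrip are illustrative and not from the report.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OptionalDataException;
import java.io.Serializable;

class OptionalDataDemo {
    static class Test implements Serializable {
        private static final long serialVersionUID = 123456789L;

        // Hypothetical switch: false emulates the writer of version B,
        // true emulates the writer of version C.
        static boolean writeOptional = false;

        int value = 7;
        transient String optional; // filled in from the optional data, if any

        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject();
            if (writeOptional) {
                s.writeObject("extra"); // optional data introduced in version C
            }
        }

        private void readObject(ObjectInputStream s)
                throws ClassNotFoundException, IOException {
            s.defaultReadObject();
            try {
                optional = (String) s.readObject();
            } catch (OptionalDataException e) {
                if (!e.eof) {
                    throw e; // unexpected primitive data, notify caller
                }
                // eof == true: the {endBlockData} marker was reached,
                // so the writer had no optional data
            }
        }
    }

    // Writes a Test followed by an unrelated String, reads both back, and
    // returns "<optional data seen by Test>/<object that followed Test>".
    static String roundTrip(boolean asVersionC) {
        Test.writeOptional = asVersionC;
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Test());
                out.writeObject("next");
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Test t = (Test) in.readObject();
                Object next = in.readObject();
                return t.optional + "/" + next;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(false)); // null/next
        System.out.println(roundTrip(true));  // extra/next
    }
}
```

      In both cases the String that follows the Test object is left intact
      for the caller, because the {endBlockData} marker stops readObject
      from reading past the data that belong to Test.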

      We can say that evolution B->C is automatically backward compatible.
      OTOH, evolution A->C is not and C should take special care to handle
      streams serialized for A.

      Of course developers don't sprinkle those trivial writeObject methods
      in every Serializable class in anticipation of evolution to C; that
      is, evolution A->C is much more likely to happen in practice than
      evolution B->C. For obvious ease-of-use reasons the most frequent
      case should be the best supported by the API, but is it so in this
      case? Let's consider what version C needs to do to support A->C.

      As was shown above, s.readObject doesn't throw OptionalDataException in
      this situation.

      Can we detect that class data were serialized without {objectAnnotation}?
      No, we can't as we don't have access to the class descriptor for the
      version of the class *in the stream*. At best we can use
      ObjectStreamClass.forClass API to obtain "the class in the *local* VM
      that this version is mapped to" (emphasis mine).

      It seems that the only way to handle A->C is a trick that duplicates
      the !(SC_WRITE_METHOD & classDescFlags) information in the evolved
      class: an extra field is added to the class at the same time
      writeObject is added. Since the new field and the writeObject method
      are tied in this way, we can use the GetField API to check whether
      the new field was defaulted (ObjectInputStream.GetField.defaulted).
      If it was, the lack of that field in the serial data implies (by our
      arrangement) the lack of a writeObject method in the version of the
      class that wrote these serial data, and thus we know that reading
      any optional data is unsafe (because there's no {objectAnnotation}
      in the stream).

      Of course we don't have to use a dedicated field for the marker.
      E.g. in the case of 4352819 a new field wheelScrollingEnabled was
      added to ScrollPane anyway, so it could be used as a marker.

      Thus the evolved version (let's call it C-prim) should have the
      following readObject method to handle evolution from A.


          // Marker field introduced in the same version the writeObject is
          // introduced. We don't have to use a dedicated field if any
          // "real" fields are added to the class.
          private boolean hasOptionalData = true;

          private void readObject(ObjectInputStream s)
              throws ClassNotFoundException, IOException
          {
              // read default fields; can't use s.defaultReadObject,
              // have to use GetField API to be able to check for 'defaulted'.
              ObjectInputStream.GetField f = s.readFields();

              // assign fields manually (casting to each field's type as
              // needed); this is boring and error-prone, and it shouldn't
              // be like that, but the defaultReadObject and GetField APIs
              // are an all-or-nothing choice.
              someField = f.get("someField", null);
              // ... etc, ad nauseam

              if (f.defaulted("hasOptionalData")) {
                  // stream written by A, no optional data, any further
                  // reading from the stream will be a disaster
              }
              else {
                  // try to read optional data (version C and later)
                  try {
                      ... = s.readObject();
                      // ...
                  }
                  catch (OptionalDataException e) {
                      if (!e.eof) {
                          // something bad happened, notify caller
                          throw e;
                      }
                      // else: we just hit the end of optional data,
                      // so we're done with the stream and can do further
                      // processing of data just read
                  }
              }

              // do necessary setup if any
          }

      As you can see, this approach, while it works, places on the developer
      the burden of using the GetField API (very inconvenient in this case).
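
      The marker-field check can be exercised without keeping an old build
      of the class around. The sketch below makes several assumptions that
      are not in the report: the names MarkerFieldDemo, value, optional,
      versionAStream, currentStream and read are hypothetical, and the
      "version A" stream is assembled by hand from the grammar quoted
      earlier (class descriptor with only SC_SERIALIZABLE set, one int
      field, no {objectAnnotation}) and followed by an unrelated String.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.OptionalDataException;
import java.io.Serializable;

class MarkerFieldDemo {
    // Plays the role of C-prim: has the marker field and writeObject.
    static class Test implements Serializable {
        private static final long serialVersionUID = 123456789L;
        int value = 7;
        private boolean hasOptionalData = true; // marker field
        transient String optional;

        private void writeObject(ObjectOutputStream s) throws IOException {
            s.defaultWriteObject();
            s.writeObject("extra"); // optional data
        }

        private void readObject(ObjectInputStream s)
                throws ClassNotFoundException, IOException {
            ObjectInputStream.GetField f = s.readFields();
            value = f.get("value", 0);
            // Marker absent from the stream => written without writeObject.
            hasOptionalData = !f.defaulted("hasOptionalData");
            if (hasOptionalData) {
                try {
                    optional = (String) s.readObject();
                } catch (OptionalDataException e) {
                    if (!e.eof) throw e;
                }
            }
        }
    }

    // Hand-built bytes emulating what a version without writeObject and
    // without the marker field would have written, then a String "next".
    static byte[] versionAStream() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream d = new DataOutputStream(bytes);
        d.writeShort(ObjectStreamConstants.STREAM_MAGIC);
        d.writeShort(ObjectStreamConstants.STREAM_VERSION);
        d.writeByte(ObjectStreamConstants.TC_OBJECT);
        d.writeByte(ObjectStreamConstants.TC_CLASSDESC);
        d.writeUTF(Test.class.getName());
        d.writeLong(123456789L);                            // serialVersionUID
        d.writeByte(ObjectStreamConstants.SC_SERIALIZABLE); // no SC_WRITE_METHOD
        d.writeShort(1);                                    // one field
        d.writeByte('I');
        d.writeUTF("value");
        d.writeByte(ObjectStreamConstants.TC_ENDBLOCKDATA); // end of class annotation
        d.writeByte(ObjectStreamConstants.TC_NULL);         // no superclass descriptor
        d.writeInt(42);                                     // {values}
        d.writeByte(ObjectStreamConstants.TC_STRING);       // unrelated next object
        d.writeUTF("next");
        d.flush();
        return bytes.toByteArray();
    }

    // What the current class writes, followed by the same unrelated String.
    static byte[] currentStream() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(new Test());
            out.writeObject("next");
        }
        return bytes.toByteArray();
    }

    // Reads a Test and the object after it; summarizes what was seen.
    static String read(byte[] stream) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(stream))) {
            Test t = (Test) in.readObject();
            Object next = in.readObject();
            return t.value + "/" + t.hasOptionalData + "/" + t.optional + "/" + next;
        }
    }

    // Convenience wrapper without checked exceptions.
    static String demo(boolean oldStream) {
        try {
            return read(oldStream ? versionAStream() : currentStream());
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(true));  // 42/false/null/next
        System.out.println(demo(false)); // 7/true/extra/next
    }
}
```

      For the hand-built stream the marker is reported as defaulted, no
      optional data are read, and the String after the object is still
      there for the caller.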


      It is proposed that an API be added to ObjectInputStream to query the
      stream for the presence of {objectAnnotation}. The stream already
      has this information, since it has read the {classDesc} by the time
      the {classdata} is being read. I'm not sure in which particular form
      this data can be made available - this is left up to your expert
      judgement; however, let me present a few suggestions.

      The best solution, if possible, is for s.defaultReadObject() and
      s.readFields() to set a flag indicating that {values} for the current
      class have been read from the stream. This flag will be reset when
      invokeObjectReader returns. If this flag is set and s.readObject() or
      another read* method is invoked, the stream can check the
      {classDescFlags} written to the stream and, if SC_WRITE_METHOD is not
      present in {classDescFlags}, throw an OptionalDataException with
      eof == true.


      This approach will make evolution A->C automatically compatible just
      like B->C is. This is the best solution from the ease-of-use and
      least-surprise point of view.


      For API completeness this information could also be exported in the
      form of new API calls on ObjectInputStream and/or ObjectStreamClass.
      E.g.

          public class ObjectInputStream {

              /**
               * Returns class descriptor flags for the current class.
               * @see ObjectStreamConstants
               */
              public int getStreamClassFlags();

          }

      You are in a better position to design the particular form of these
      new APIs.

      ======================================================================

            mwarressunw Michael Warres (Inactive)
            uwesunw Uwe Uwe (Inactive)