JDK-6431636

(coll) New methods for handling iterable sequences in Collection framework


    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • None
    • 5.0
    • core-libs

      A DESCRIPTION OF THE REQUEST :
      Proposal
      =======

      I propose adding to interface 'java.util.Collection'
      the following methods for handling iterable sequences:

        a: Methods receiving instances of interface 'java.util.Iterable':

      boolean addAll(Iterable<? extends T> iterable)
      boolean containsAll(Iterable<? extends T> iterable)
      boolean removeAll(Iterable<? extends T> iterable)
      boolean retainAll(Iterable<? extends T> iterable)

        b: Methods receiving instances of interface 'java.util.Iterator':

      boolean addAll(Iterator<? extends T> iterator)
      boolean containsAll(Iterator<? extends T> iterator)
      boolean removeAll(Iterator<? extends T> iterator)
      boolean retainAll(Iterator<? extends T> iterator)

      In particular, the interface 'java.util.List' should receive
      the following additional methods:

        a: Methods receiving instances of interface 'java.util.Iterable':

          boolean addAll(int index, Iterable<? extends T> iterable)

        b: Methods receiving instances of interface 'java.util.Iterator':

          boolean addAll(int index, Iterator<? extends T> iterator)

      For the concrete collections within the Collections framework
      (LinkedList, HashSet, etc.) I further suggest adding the
      following constructors:

        a: Constructors receiving instances of interface 'java.util.Iterable':

          public ConcreteCollection(Iterable<? extends T> iterable)
          
        b: Constructors receiving instances of interface 'java.util.Iterator':
          
          public ConcreteCollection(Iterator<? extends T> iterator)

      The semantics of these methods/constructors would be analogous to the
      semantics of the corresponding existing methods/constructors,
      which receive a 'Collection<? extends T>' as their argument:
      instead of a collection, which is itself an iterable sequence
      (Collection extends Iterable), the methods would receive
      an iterable sequence, either as an 'Iterable' object or represented by an iterator.

      Implementation
      ============

      The new methods/constructors can be implemented
      in a rather straightforward way.
      As an example, here is code for 'LinkedList'.

        public LinkedList(Iterable<? extends T> iterable) {
          this(iterable.iterator());
        }

        public LinkedList(Iterator<? extends T> iterator) {
          super();
          this.addAll(iterator);
        }

        public boolean addAll(Iterable<? extends T> iterable) {
          return this.addAll(iterable.iterator());
        }

        public boolean addAll(Iterator<? extends T> iterator) {
          // Report whether the list changed, as the proposed signature requires.
          boolean modified = false;
          while (iterator.hasNext()) {
              modified |= this.add(iterator.next());
          }
          return modified;
        }
        
      It should be mentioned that the last of these methods can be implemented
      more efficiently by directly manipulating the internal
      representation of the list. See the implementation of
      'java.util.LinkedList.addAll(int, Collection<? extends T>)'.

      After adding the proposed methods, one can change the
      implementation of the 'Collection<...>' versions of the methods
      to become simple delegates to the 'Iterable<...>' versions. This
      would avoid duplication of similar code. For instance:

        public boolean addAll(Collection<? extends T> c) {
          // The cast selects the Iterable overload instead of recursing.
          return this.addAll( (Iterable<? extends T>) c );
        }


      JUSTIFICATION :
      Reasoning
      =========

      a: Reasoning for the 'Iterable<...>' methods/constructors
      -----------------------------------------------------------

      There is some need for creating new collections from existing
      collections of objects, or to merge two existing collections.
      In Java 1.5 (and in the upcoming 1.6, too), we only have a constructor
      for creating e.g. a new List from a j.u.Collection.
      Until 1.4 this was already a slight problem, in that there were already
      other ways to represent collections of objects: Not just by "official"
      Java CollectionS, but also by IteratorS. But with the introduction
      of the Iterable interface in Java 1.5, together with the
      addition of the for-each loop, the definition of iterators for
      user defined classes will presumably become a common task in Java.
      So this should also be recognized in the definition of the standard APIs.
      Currently I have to define utility functions for this purpose in all
      my projects. While this is very easy to do (see section "Implementation"),
      I feel this is not the right thing to do -- it should be there out of the box.
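
      A minimal sketch of the kind of utility function meant here follows.
      The class name 'IterableUtil' and the method 'newList' are hypothetical
      names chosen for illustration; they are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
final class IterableUtil {
    private IterableUtil() {}

    // Adds every element the iterator produces; returns true if the target changed.
    static <T> boolean addAll(Collection<? super T> target, Iterator<? extends T> source) {
        boolean modified = false;
        while (source.hasNext()) {
            modified |= target.add(source.next());
        }
        return modified;
    }

    // Iterable variant, delegating to the Iterator variant.
    static <T> boolean addAll(Collection<? super T> target, Iterable<? extends T> source) {
        return IterableUtil.<T>addAll(target, source.iterator());
    }

    // Stand-in for the proposed constructor: builds a new list from any Iterable.
    static <T> List<T> newList(Iterable<? extends T> source) {
        List<T> result = new ArrayList<T>();
        IterableUtil.<T>addAll(result, source);
        return result;
    }
}
```

      With such a helper, 'IterableUtil.newList(someIterable)' plays the role
      of the proposed 'new ArrayList<T>(someIterable)' constructor.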

      b: Reasoning for the 'Iterator<...>' methods/constructors
      -----------------------------------------------------------

      While discussing these new methods in the Java forum,
      most of the discussion was on the question of whether to add
      the 'Iterator<...>' methods or not.
      They would indeed not be necessary if every class
      for which an iterator exists implemented
      the 'Iterable' interface. This makes sense for newly defined classes,
      but there are several existing classes which do not currently
      implement 'Iterable', and some of those classes won't be
      able to do so later on.

      For example, the "Jena Semantic Web" framework, hosted on
      <jena.sourceforge.net>, has a key concept called "Model",
      which actually represents an RDF graph. You can query such a Model
      for RDF triples with the model's 'list' methods, which return
      an 'Iterator' representing the result set of the query.
      Because Jena is a pre-J1.5 development, the Model interface
      does not extend the 'Iterable' interface. And because Jena is the base
      for a lot of existing code, this won't change that fast.

      On the other hand, one could argue that it is not necessary
      to support old code when creating new features within a
      new framework. I admit to having some sympathy with this thought,
      so I would like to leave it open to further discussion whether
      to add the 'Iterator<...>' methods or not.

      Compatibility
      ==========

      This new feature would break no existing code, because it only adds
      new methods and constructors to the existing framework. Besides that,
      it would be a rather conservative feature:
      The 'Iterable<...>' methods are just an extension
      to the existing 'Collection<...>' methods,
      and the 'Iterator<...>' methods are _pragmatically_ analogous
      to the 'Iterable<...>' methods.

      While it seems somewhat more restricted to think of a _sequence_ of objects
      instead of a _collection_ of objects, this is de facto not a restriction.
      The Javadoc specification of 'Collection.addAll' does not say anything
      about the order in which the collection's entries are read, so in this
      case, any order is acceptable. And whenever, as for 'List.addAll',
      an order is defined, it is defined in terms of the iterator returned by the
      'iterator()' method, which is itself declared by the 'Iterable' interface.
      So again no problem. Another point to consider is the 'Set' interface:
      here each entry of a sequence must be inserted only once, and only if it is not
      already present. But this is also the definition for the
      current 'Collection<...>' version of 'addAll', so the 'Iterable<...>'
      and 'Iterator<...>' versions can be handled in the same way
      with no problems.
       
      It should be mentioned that there will be no clashes between
      the current 'Collection<...>' methods and the new 'Iterable<...>' methods;
      both versions of those methods can coexist.
      When given an argument whose compile-time type is a 'Collection', the
      'Collection<...>' version of the method will be chosen by the compiler,
      so there will be no danger of a hidden change in semantics after
      recompiling against the new interface, nor any loss of performance.
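
      The overload resolution described above can be checked with a small
      sketch. The class 'OverloadDemo' and its methods are hypothetical
      stand-ins for the existing and the proposed signatures; they only
      report which overload the compiler selected.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical class for illustration only; not part of the JDK.
final class OverloadDemo {
    private OverloadDemo() {}

    // Mirrors the existing Collection-based signature.
    static String addAll(Collection<?> c) {
        return "Collection";
    }

    // Mirrors the proposed Iterable-based signature.
    static String addAll(Iterable<?> it) {
        return "Iterable";
    }

    static String viaCollection() {
        List<String> list = new ArrayList<String>();
        // Both overloads apply; the more specific Collection one is chosen.
        return addAll(list);
    }

    static String viaIterable() {
        Iterable<String> view = new ArrayList<String>();
        // The static type is Iterable, so only the new overload applies.
        return addAll(view);
    }
}
```

      Note that the choice is made from the static type of the argument:
      a collection that is referenced through an 'Iterable'-typed variable
      goes to the 'Iterable<...>' version.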
      The submitter provided the following SDN comment:

      > Having methods that take Collections instead of Iterables does allow
      > optimizations based on calling size().

      I do not want to _replace_ the methods taking a Collection. As I
      already pointed out in the RFE, it is perfectly possible for them
      to coexist with the new methods, and so you can put all kinds of
      optimized handling into them. But even in case of replacement,
      you could do a simple runtime type check (by means of instanceof)
      to handle CollectionS specially.
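
      A sketch of such a runtime type check, assuming a hypothetical helper
      class 'SizeAwareAdd' (not part of the JDK): when the iterable turns out
      to be a Collection, the existing bulk method is used, so any
      size()-based optimization inside it still applies.

```java
import java.util.ArrayList;
import java.util.Collection;

// Hypothetical helper for illustration only; not part of the JDK.
final class SizeAwareAdd {
    private SizeAwareAdd() {}

    static <T> boolean addAll(ArrayList<T> target, Iterable<? extends T> source) {
        if (source instanceof Collection) {
            // Size is known: hand over to the existing Collection-based method.
            Collection<? extends T> c = (Collection<? extends T>) source;
            return target.addAll(c);
        }
        // Size is unknown: fall back to element-by-element insertion.
        boolean modified = false;
        for (T element : source) {
            modified |= target.add(element);
        }
        return modified;
    }
}
```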


      Comment added by :
      > It might have been better to use the more general signatures
      > suggested by the submitter, but interfaces can never be changed compatibly

      This argument has several facets, some of which have already been
      discussed in the forum.

      First, adding _Constructors_ to the concrete classes will not have any
      impact on existing code, neither on code using the classes nor on
      code defining custom collections. So adding at least the constructors
      won't be a problem.

      Second, by putting reasonable default implementations for the new methods
      into the abstract helper classes ('AbstractList', etc.),
      implementations of custom collections would not be hit by the change
      if they extend those helpers. Doing so is recommended practice anyway!
      Of course, if a custom class wants to take advantage of the new methods, it can override
      them for optimized handling at any time later. It is very easy for the new
      methods to have a reasonable default implementation, as I already showed in the RFE.

      But now I see one really dangerous point, indeed: In Java, it is not always
      possible to extend a class, because multiple inheritance is not
      allowed, so sometimes one _must_ implement the interface instead of
      extending the helper class. One such case would be the application of
      the class adapter design pattern (Gamma et al., "Design Patterns", p.139).
      For example, someone has some class A which already extends
      another class B with an interface similar to that of 'Collection'.
      To get this class into Java's Collection framework, they would add an
      'implements Collection' to the definition of class A. In such a case,
      their code would break after new methods are added to the Collection interface.

      So, while I do not agree with you that one has to get interfaces right
      in the first place (interfaces will always change, because external
      circumstances change; there is no such thing as "everything done just right"
      in software development, and there will always come a time when
      one has to break compatibility to keep a language or software alive),
      here is my

      ======================
      PROPOSED CHANGE TO RFE
      ======================

         * No change to the RFE with regard to the _constructors_: They should
           still go into the concrete implementing classes of Collection
           (consider adding them also to the abstract helper classes).

         * Do _not_ add the proposed _methods_ to the Collection interface.

         * Add implementations of the proposed _methods_ to the concrete implementing classes
           of Collection and also to the abstract helper classes.

         * Add corresponding static methods to class 'j.u.Collections'. These are:

            * static boolean addAll(Collection c, Iterable iterable)
            * static boolean containsAll(Collection c, Iterable iterable)
            * static boolean removeAll(Collection c, Iterable iterable)
            * static boolean retainAll(Collection c, Iterable iterable)
       
            * static boolean addAll(Collection c, Iterator iterator)
            * static boolean containsAll(Collection c, Iterator iterator)
            * static boolean removeAll(Collection c, Iterator iterator)
            * static boolean retainAll(Collection c, Iterator iterator)

           These static methods can internally check for the concrete type of the given collection
           and delegate to the corresponding method. For unrecognized types (custom implementations) there
           will be a default implementation analogous to the implementation I suggested in the original RFE.
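
      A sketch of what the default (fallback) path of three of these static
      methods could look like. The class name 'MoreCollections' is a
      hypothetical stand-in for 'j.u.Collections', and wildcard types are
      used here in place of the raw types listed above. The dispatch on the
      concrete collection type is left out, because the instance methods it
      would delegate to do not exist yet.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

// Hypothetical stand-in for the proposed additions to java.util.Collections.
final class MoreCollections {
    private MoreCollections() {}

    // True if the collection contains every element the iterator produces.
    static boolean containsAll(Collection<?> c, Iterator<?> iterator) {
        while (iterator.hasNext()) {
            if (!c.contains(iterator.next())) {
                return false;
            }
        }
        return true;
    }

    // Removes all occurrences of the iterator's elements from the collection.
    static boolean removeAll(Collection<?> c, Iterator<?> iterator) {
        return c.removeAll(drain(iterator));
    }

    // Keeps only the elements that the iterator also produces.
    static boolean retainAll(Collection<?> c, Iterator<?> iterator) {
        return c.retainAll(drain(iterator));
    }

    // An iterator can be traversed only once, so its elements are buffered
    // in a set before the existing Collection-based bulk methods are reused.
    private static Set<Object> drain(Iterator<?> iterator) {
        Set<Object> buffer = new HashSet<Object>();
        while (iterator.hasNext()) {
            buffer.add(iterator.next());
        }
        return buffer;
    }
}
```

      The 'Iterable' variants would simply pass 'iterable.iterator()' to the
      methods shown here.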

      After some years of transition, one can then reconsider adding the method
      declarations directly to the interfaces themselves.

            Unassigned Unassigned
            ndcosta Nelson Dcosta (Inactive)