Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: None
Affects Version/s: 5.0
Component/s: core-libs
Labels:
- jsr166
- webbug

Subcomponent:
java.util:collections
Understanding:
Cause Known
CPU:
x86
OS:
windows_xp

A DESCRIPTION OF THE REQUEST :
Proposal
=======

I propose adding to interface 'java.util.Collection'
the following methods for handling iterable sequences:

  a: Methods receiving instances of interface 'java.util.Iterable':

boolean addAll(Iterable<? extends T> iterable)
boolean containsAll(Iterable<? extends T> iterable)
boolean removeAll(Iterable<? extends T> iterable)
boolean retainAll(Iterable<? extends T> iterable)

  b: Methods receiving instances of interface 'java.util.Iterator':

boolean addAll(Iterator<? extends T> iterator)
boolean containsAll(Iterator<? extends T> iterator)
boolean removeAll(Iterator<? extends T> iterator)
boolean retainAll(Iterator<? extends T> iterator)

Especially, the interface 'java.util.List' should receive
the following additional methods:

  a: Methods receiving instances of interface 'java.util.Iterable':

    boolean addAll(int index, Iterable<? extends T> iterable)

  b: Methods receiving instances of interface 'java.util.Iterator':

    boolean addAll(int index, Iterator<? extends T> iterator)

For the concrete collections within the Collections framework
(LinkedList, HashSet, etc.) I further suggest adding the
following constructors:

  a: Constructors receiving instances of interface 'java.util.Iterable':

    public ConcreteCollection(Iterable<? extends T> iterable)

  b: Constructors receiving instances of interface 'java.util.Iterator':

    public ConcreteCollection(Iterator<? extends T> iterator)

The semantics of those methods/constructors would be analogous to the
semantics of the corresponding existing methods/constructors,
which receive a 'Collection<? extends T>' as their argument:
Instead of a collection, which itself is an iterable sequence
(Collection extends Iterable), the methods would receive
an iterable sequence, either as an object or represented by an iterator.

Implementation
============

The implementation of the new methods/constructors
is possible in a rather straightforward form.
As an example, here is code for 'LinkedList'.

  public LinkedList(Iterable<? extends T> iterable) {
    this(iterable.iterator());
  }

  public LinkedList(Iterator<? extends T> iterator) {
    super();
    this.addAll(iterator);
  }

  public boolean addAll(Iterable<? extends T> iterable) {
    return this.addAll(iterable.iterator());
  }

  public boolean addAll(Iterator<? extends T> iterator) {
    boolean changed = false;
    while (iterator.hasNext()) {
        // add() reports whether the list changed
        changed |= this.add(iterator.next());
    }
    return changed;
  }

It should be mentioned that the last of those methods can be implemented
in a more efficient way by directly manipulating the internal
representation of the list. See the implementation of
'java.util.LinkedList.addAll(int, Collection<? extends T>)'.

After adding the proposed methods, one can change the
implementation of the 'Collection<...>' versions of the methods
to become simple delegates to the 'Iterable<...>' versions. This
would avoid duplication of similar code. For instance:

  public boolean addAll(Collection<? extends T> c) {
    // the cast selects the Iterable overload explicitly
    return this.addAll((Iterable<? extends T>) c);
  }

JUSTIFICATION :
Reasoning
=========

a: Reasoning for the 'Iterable<...>' methods/constructors
-----------------------------------------------------------

There is some need for creating new collections from existing
collections of objects, or to merge two existing collections.
In Java 1.5 (and in the upcoming 1.6, too), we only have a constructor
for creating e.g. a new List from a j.u.Collection.
Up to 1.4 this was already a slight problem, in that there were
other ways to represent collections of objects: Not just by "official"
Java CollectionS, but also by IteratorS. But with the introduction
of the Iterable interface in Java 1.5, together with the
addition of the for-each loop, the definition of iterators for
user defined classes will presumably become a common task in Java.
So this should also be recognized in the definition of the standard APIs.
Currently I have to define utility functions for this purpose in all
my projects. While this is very easy to do (see section "Implementation"),
I feel this is not the right thing to do -- it should be there out of the box.
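
For illustration, a utility function of the kind mentioned above might look like the following. This is only a minimal sketch: the class name 'IterableUtils' and the method name 'newList' are hypothetical and not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
final class IterableUtils {
    private IterableUtils() {}

    // Creates a new list holding the elements of the sequence in iteration order.
    static <T> List<T> newList(Iterable<? extends T> iterable) {
        List<T> result = new ArrayList<T>();
        for (T element : iterable) {
            result.add(element);
        }
        return result;
    }
}
```

With the proposed constructors, a call such as 'IterableUtils.newList(source)' could be written directly as 'new ArrayList<T>(source)'.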

b: Reasoning for the 'Iterator<...>' methods/constructors
-----------------------------------------------------------

While discussing these new methods in the Java forum,
most discussion was on the question whether to add
the 'Iterator<...>' methods or not.
They would indeed not be necessary if every class
for which an iterator exists implemented
the 'Iterable' interface. This makes sense for newly defined classes,
but there are several existing classes which do not currently
implement 'Iterable', and some of those classes won't be
able to do this later on.

For example, the "Jena Semantic Web" framework, hosted on
<jena.sourceforge.net>, has a key concept called "Model",
which actually represents an RDF graph. You can query such a Model
for RDF triples by the model's 'list' methods, which will return you
an 'Iterator' representing the result set of the query.
Because Jena is a pre-J1.5 development, the Model interface
does not extend the Iterable interface. And because Jena is the base
for a lot of existing code, this won't change that fast.
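
The situation can be sketched as follows. The class 'LegacyStore' is a hypothetical stand-in for an API that exposes its results only as an 'Iterator'; it is not Jena's actual Model interface. The helper 'IteratorDrain.addAll' is likewise an assumed name.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;

// Hypothetical pre-1.5 style API: results are available only as an Iterator.
final class LegacyStore {
    private final String[] entries = { "s1 p1 o1", "s2 p2 o2", "s1 p1 o1" };

    // Returns a fresh one-shot iterator over the stored entries.
    Iterator<String> list() {
        return Arrays.asList(entries).iterator();
    }
}

// Hypothetical helper that does what the proposed addAll(Iterator) would do.
final class IteratorDrain {
    private IteratorDrain() {}

    // Adds every remaining element; returns true if the target changed.
    static <T> boolean addAll(Collection<? super T> target, Iterator<? extends T> iterator) {
        boolean changed = false;
        while (iterator.hasNext()) {
            changed |= target.add(iterator.next());
        }
        return changed;
    }
}
```

A caller would write 'IteratorDrain.addAll(result, store.list())' today; with the 'Iterator<...>' methods this would become 'result.addAll(store.list())'.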

On the other hand, one could argue that it is not necessary
to support old code when creating new features within a
new framework. I admit to having some sympathy with this thought,
so I would like to leave it open to further discussion whether
to add the 'Iterator<...>' methods or not.

Compatibility
==========

This new feature would break no existing code, because it only adds
new methods and constructors to the existing framework. Besides that,
it would be a rather conservative feature:
The 'Iterable<...>' methods are just an extension
to the existing 'Collection<...>' methods,
and the 'Iterator<...>' methods are _pragmatically_ analogous
to the 'Iterable<...>' methods.

While it may seem somewhat more restricted to think of a _sequence_ of objects
instead of a _collection_ of objects, this is de facto not a restriction.
The Javadoc specification of 'Collection.addAll' does not say anything
about the order in which the collection's entries are read, so in this
case, any order is acceptable. And whenever, as for 'List.addAll',
an order is defined, it is defined in terms of the iterator returned by the
'iterator()' method, which itself is declared by the 'Iterable' interface.
So again no problem. Another point to consider is the 'Set' interface:
Here each entry of a sequence must be inserted only if it is not
already present. But this is also the definition for the
current 'Collection<...>' version of 'addAll', so the 'Iterable<...>'
and 'Iterator<...>' versions can be handled in the same way
with no problems.

It should be mentioned that there will be no clashes between
the current 'Collection<...>' methods and the new 'Iterable<...>' methods;
both versions of those methods can coexist.
When given an argument whose static type is a 'Collection', the
'Collection<...>' version of the method will be chosen by the compiler,
so there will be no danger of a hidden change in semantics after recompiling
against the new interface, and it does not even lose any performance.
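
The overload selection can be observed with a small sketch. The class 'OverloadDemo' and its method 'which' are hypothetical and exist only to show which overload the compiler picks.

```java
import java.util.Collection;

// Hypothetical class with both overloads, used only to observe overload resolution.
final class OverloadDemo {
    private OverloadDemo() {}

    // Chosen when the argument's static type is a Collection (the more specific overload).
    static String which(Collection<?> c) {
        return "Collection";
    }

    // Chosen when the argument's static type is only an Iterable.
    static String which(Iterable<?> i) {
        return "Iterable";
    }
}
```

Calling 'OverloadDemo.which(new ArrayList<String>())' selects the 'Collection' overload. Note that the choice depends on the static type: the same list cast to 'Iterable<String>' selects the 'Iterable' overload.
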
The submitter provided the following SDN comment:

> Having methods that take Collections instead of Iterables does allow
> optimizations based on calling size().

I do not want to _replace_ the methods getting a Collection. As I
already pointed out in the RFE, it is perfectly possible for them
to coexist with the new methods, and so you can put all kinds of
optimized handling into them. But even in case of replacement,
you could do a simple runtime type check (by means of instanceof)
to handle CollectionS specially.
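
Such a runtime type check could look roughly like this. It is a sketch only; 'TypeCheckedAdd' is a hypothetical name, and the existing bulk 'addAll' stands in for whatever size-based optimization an implementation would apply.

```java
import java.util.Collection;

// Hypothetical helper showing an instanceof check inside an Iterable-based addAll.
final class TypeCheckedAdd {
    private TypeCheckedAdd() {}

    static <T> boolean addAll(Collection<? super T> target, Iterable<? extends T> source) {
        if (source instanceof Collection) {
            // A real Collection knows its size(), so the existing bulk method
            // can apply its own optimizations.
            Collection<? extends T> c = (Collection<? extends T>) source;
            return target.addAll(c);
        }
        // Plain Iterable: fall back to element-by-element insertion.
        boolean changed = false;
        for (T element : source) {
            changed |= target.add(element);
        }
        return changed;
    }
}
```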

Comment added by :
> It might have been better to use the more general signatures
> suggested by the submitter, but interfaces can never be changed compatibly

This argument has several facets, some of which have already been
discussed in the forum.

First, adding _Constructors_ to the concrete classes will not have any
impact on existing code, neither on code using the classes nor on
code defining custom collections. So adding at least the constructors
won't be a problem.

Second, by putting reasonable default implementations for the new methods
into the definition helper classes ('AbstractList', etc.),
implementations of custom collections would not be hit by the change,
if they extend those helpers. To do so is recommended practice, anyway!
Of course, if a custom class wants to cope with the new methods, it can override
them for optimized handling at any time later. It is very easy for the new
methods to have a reasonable default implementation, as I already showed in the RFE.

But now I see one really dangerous point, indeed: In Java, it is not always
possible to extend a class, because multiple inheritance is not
allowed, so sometimes one _must_ implement the interface instead of
extending the helper class. One such case would be the application of
the class adapter design pattern (Gamma et al., "Design Patterns", p.139).
For example, someone has some class A which already extends
another class B with an interface similar to that of 'Collection'.
To get this class into Java's Collection framework, he would add an
'implements Collection' to the definition of class A. In such a case,
his code would break after adding new methods to the Collection interface.

So, while I do not agree with you that one has to get interfaces right
in the first place (interfaces will always change, because external
circumstances change; there is no such thing as "everything done just right"
in software development, and there will always come a time when
one has to break compatibility to keep a language or software alive),
here is my

======================
PROPOSED CHANGE TO RFE
======================

   * No change to the RFE with regard to the _constructors_: They should
     still go into the concrete implementing classes of Collection
     (consider adding them also to the abstract helper classes).

   * Do _not_ add the proposed _methods_ to the Collection interface.

   * Add implementations of the proposed _methods_ to the concrete implementing classes
     of Collection and also to the abstract helper classes.

   * Add corresponding static methods to class 'j.u.Collections'. These are:

      * static boolean addAll(Collection c, Iterable iterable)
      * static boolean containsAll(Collection c, Iterable iterable)
      * static boolean removeAll(Collection c, Iterable iterable)
      * static boolean retainAll(Collection c, Iterable iterable)

      * static boolean addAll(Collection c, Iterator iterator)
      * static boolean containsAll(Collection c, Iterator iterator)
      * static boolean removeAll(Collection c, Iterator iterator)
      * static boolean retainAll(Collection c, Iterator iterator)

     These static methods can internally check for the concrete type of the given collection
     and delegate to the corresponding method. For unrecognized types (custom implementations) there
     will be a default implementation analogous to the implementation I suggested in the original RFE.
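
A minimal sketch of what the default implementations of these static methods might look like, covering only the unrecognized-type fallback. The class name 'IteratorCollections' is hypothetical (the proposal places the methods in 'j.u.Collections'). 'addAll' is omitted because it follows the loop shown in the Implementation section. The buffering in 'retainAll' is an assumption of this sketch, needed because an iterator can be traversed only once.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical utility class with fallback implementations for arbitrary collections.
final class IteratorCollections {
    private IteratorCollections() {}

    // True if every remaining element of the iterator is contained in c.
    static boolean containsAll(Collection<?> c, Iterator<?> iterator) {
        while (iterator.hasNext()) {
            if (!c.contains(iterator.next())) {
                return false;
            }
        }
        return true;
    }

    // Removes all occurrences of every remaining element; true if c changed.
    static boolean removeAll(Collection<?> c, Iterator<?> iterator) {
        boolean changed = false;
        while (iterator.hasNext()) {
            Object element = iterator.next();
            while (c.remove(element)) {
                changed = true;
            }
        }
        return changed;
    }

    // Buffers the iterator's elements, then delegates to the existing retainAll.
    static boolean retainAll(Collection<?> c, Iterator<?> iterator) {
        List<Object> buffer = new ArrayList<Object>();
        while (iterator.hasNext()) {
            buffer.add(iterator.next());
        }
        return c.retainAll(buffer);
    }

    // The Iterable variants simply delegate to the Iterator variants.
    static boolean containsAll(Collection<?> c, Iterable<?> iterable) {
        return containsAll(c, iterable.iterator());
    }

    static boolean removeAll(Collection<?> c, Iterable<?> iterable) {
        return removeAll(c, iterable.iterator());
    }

    static boolean retainAll(Collection<?> c, Iterable<?> iterable) {
        return retainAll(c, iterable.iterator());
    }
}
```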

After some years of transition, one can then reconsider adding the method
declarations directly to the interfaces themselves.

relates to

JDK-6463989 (coll) Provide Iterable utility methods and light-weight Iterable implementations

Closed