FULL PRODUCT VERSION :
A DESCRIPTION OF THE PROBLEM :
The javadoc for java.util.Collection.size() states:
If this collection contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
This "feature" has been inherited by List, Set, and Map in their size() methods:
If this list contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
If this set contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
If the map contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
The ramifications of allowing over-sized collections do not seem to have been thought out and lead to some subtle problems. Particularly, this affects Lists:
1. When iterating a list bigger than Integer.MAX_VALUE, it is not clear what ListIterator.nextIndex() and ListIterator.previousIndex() should return for far elements. For example, the doc for nextIndex states: "Returns the index of the element that would be returned by a subsequent call to next(). (Returns list size if the list iterator is at the end of the list.)"
2. List.indexOf states that it: "Returns the index of the first occurrence of the specified element in this list, or -1 if this list does not contain the element. More formally, returns the lowest index i such that (o==null ? get(i)==null : o.equals(get(i))), or -1 if there is no such index." This definition is in conflict: the list might "contain" the element, but it cannot return the index for it if it is beyond Integer.MAX_VALUE.
3. List.lastIndexOf states that it: "Returns the index of the last occurrence of the specified element in this list, or -1 if this list does not contain the element. More formally, returns the highest index i such that (o==null ? get(i)==null : o.equals(get(i))), or -1 if there is no such index." This has the same problem as indexOf, but more so. The formal definition implies that if the list contains the element but at an index beyond the Integer.MAX_VALUE limit, the method would be obligated to search for the element in another position below the limit (in order to return the "highest index i" satisfying the expression), but then this would not actually be the "last occurrence" of the element. Class LinkedList also has a method called removeLastOccurrence inherited from Deque, which "Removes the last occurrence of the specified element in this list (when traversing the list from head to tail)". This doesn't say anything about indexes, so apparently the element that would be removed by calling ll.removeLastOccurrence(e) is not necessarily in the same position as the one that would be removed by calling ll.remove(ll.lastIndexOf(e));
I believe the best fix for these issues is to remove the statement in List.size() that implies lists may be larger than Integer.MAX_VALUE. Like thread-safe GUIs, it probably sounded like a good idea but it doesn't work properly in practice, and it wouldn't be very useful even if it did because so many of List's methods depend on an int index. Instead, List should specify Integer.MAX_VALUE as a hard limit on the size of any implementation. This solves a lot of problems, while only creating one.
I'm not certain how this might affect existing code, but if there were any major java.util.List implementations that could grow over Integer.MAX_VALUE, the problems this causes probably would have been noticed by now. In any case, changing the specification won't cause any code to immediately break. It's probably better to sort this out sooner rather than later, since ever-growing memory sizes will gradually turn this from a theoretical edge case into a practical problem.
I think it's okay to leave the same statement in the javadoc of Collection, Set, and Map. Since these classes are not ruled by indices in the way that Lists are, it's probably not a major issue to permit them to be over-sized.
----
In a semi-related issue, LinkedList is one real list implementation that may become larger than Integer.MAX_VALUE, as it doesn't deliberately limit its own size. However, it malfunctions once it becomes this large. Its size, which is recorded using a 32-bit int, quietly overflows and the size() method returns garbage. Even its iterator malfunctions as its hasNext() method checks that nextIndex < size. Fixing the class to work properly when it becomes over-sized is complicated and/or impossible. Instead, it should immediately throw an OutOfMemoryError if an attempt is made to grow the list beyond Integer.MAX_VALUE. Since neither its size() method nor its iterator work properly for such large lists anyway, this change is extremely unlikely to break any existing code.
There may be other collection methods and classes that, like LinkedList, potentially malfunction now that we have 64-bit memory spaces allowing many more objects. I didn't check them all.
REPRODUCIBILITY :
This bug can be reproduced always.
A DESCRIPTION OF THE PROBLEM :
The javadoc for java.util.Collection.size() states:
If this collection contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
This "feature" has been inherited by List, Set, and Map in their size() methods:
If this list contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
If this set contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
If the map contains more than Integer.MAX_VALUE elements, returns Integer.MAX_VALUE.
The ramifications of allowing over-sized collections do not seem to have been thought out and lead to some subtle problems. Particularly, this affects Lists:
1. When iterating a list bigger than Integer.MAX_VALUE, it is not clear what ListIterator.nextIndex() and ListIterator.previousIndex() should return for far elements. For example, the doc for nextIndex states: "Returns the index of the element that would be returned by a subsequent call to next(). (Returns list size if the list iterator is at the end of the list.)"
2. List.indexOf states that it: "Returns the index of the first occurrence of the specified element in this list, or -1 if this list does not contain the element. More formally, returns the lowest index i such that (o==null ? get(i)==null : o.equals(get(i))), or -1 if there is no such index." This definition is in conflict: the list might "contain" the element, but it cannot return the index for it if it is beyond Integer.MAX_VALUE.
3. List.lastIndexOf states that it: "Returns the index of the last occurrence of the specified element in this list, or -1 if this list does not contain the element. More formally, returns the highest index i such that (o==null ? get(i)==null : o.equals(get(i))), or -1 if there is no such index." This has the same problem as indexOf, but more so. The formal definition implies that if the list contains the element but at an index beyond the Integer.MAX_VALUE limit, the method would be obligated to search for the element in another position below the limit (in order to return the "highest index i" satisfying the expression), but then this would not actually be the "last occurrence" of the element. Class LinkedList also has a method called removeLastOccurrence inherited from Deque, which "Removes the last occurrence of the specified element in this list (when traversing the list from head to tail)". This doesn't say anything about indexes, so apparently the element that would be removed by calling ll.removeLastOccurrence(e) is not necessarily in the same position as the one that would be removed by calling ll.remove(ll.lastIndexOf(e));
I believe the best fix for these issues is to remove the statement in List.size() that implies lists may be larger than Integer.MAX_VALUE. Like thread-safe GUIs, it probably sounded like a good idea but it doesn't work properly in practice, and it wouldn't be very useful even if it did because so many of List's methods depend on an int index. Instead, List should specify Integer.MAX_VALUE as a hard limit on the size of any implementation. This solves a lot of problems, while only creating one.
I'm not certain how this might affect existing code, but if there were any major java.util.List implementations that could grow over Integer.MAX_VALUE, the problems this causes probably would have been noticed by now. In any case, changing the specification won't cause any code to immediately break. It's probably better to sort this out sooner rather than later, since ever-growing memory sizes will gradually turn this from a theoretical edge case into a practical problem.
I think it's okay to leave the same statement in the javadoc of Collection, Set, and Map. Since these classes are not ruled by indices in the way that Lists are, it's probably not a major issue to permit them to be over-sized.
----
In a semi-related issue, LinkedList is one real list implementation that may become larger than Integer.MAX_VALUE, as it doesn't deliberately limit its own size. However, it malfunctions once it becomes this large. Its size, which is recorded using a 32-bit int, quietly overflows and the size() method returns garbage. Even its iterator malfunctions as its hasNext() method checks that nextIndex < size. Fixing the class to work properly when it becomes over-sized is complicated and/or impossible. Instead, it should immediately throw an OutOfMemoryError if an attempt is made to grow the list beyond Integer.MAX_VALUE. Since neither its size() method nor its iterator work properly for such large lists anyway, this change is extremely unlikely to break any existing code.
There may be other collection methods and classes that, like LinkedList, potentially malfunction now that we have 64-bit memory spaces allowing many more objects. I didn't check them all.
REPRODUCIBILITY :
This bug can be reproduced always.