-
Enhancement
-
Resolution: Unresolved
-
P5
-
None
-
6u31
-
x86
-
linux_ubuntu
FULL PRODUCT VERSION :
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Linux 3.0.0-12-generic #20-Ubuntu
A DESCRIPTION OF THE PROBLEM :
The method "AbstractCollection.retainAll(Collection<?> c)" is very
slow when the parameter "c" is empty (and "retainAll" should just
clear "this" collection). I attached a test that exposes this problem
and a one-line patch that fixes it. For this test, the patch provides
822X speedup on my machine.
The patch is very simple, just call:
if (c.isEmpty()) { clear(); return true; }
at the beginning of "retainAll".
To run the test, do:
$ java Test
The output for the un-patched version is:
Time is 822
The output for the patched version is:
Time is 1
There are other ways to speed up "retainAll", but they are more
complex, e.g., bug 5028425, whereas the above one-line patch is
trivial and provides considerable speedup for certain usage scenarios.
The method "AbstractCollection.removeAll(Collection<?> c)" can be
similarly optimized by adding "if (c.isEmpty()) return false;", but
the speed up is not as dramatic as for "retainAll".
Can you please fix this behavior?
The patch is:
======================================================================
--- AbstractCollection.java 2012-07-23 14:04:13.525513594 -0500
+++ AbstractCollection_Fix.java 2012-07-23 14:20:49.725537261 -0500
@@ -363,6 +363,7 @@
* @see #contains(Object)
*/
public boolean retainAll(Collection<?> c) {
+ if (c.isEmpty()) { clear(); return true; }
boolean modified = false;
Iterator<E> e = iterator();
while (e.hasNext()) {
======================================================================
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
$ java Test
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Time is 1
ACTUAL -
Time is 822
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.util.ArrayList;
public class Test {
public static void main(String[] args) {
ArrayList<Integer> collection = new ArrayList<Integer>();
for (int i = 0; i < 100000; i++) collection.add(i);
ArrayList<Integer> other = new ArrayList<Integer>();
long start = System.currentTimeMillis();
collection.retainAll(other);
System.out.println("Time is " + (System.currentTimeMillis() - start));
}
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
I realize one can solve this problem from the caller site,
replacing "collection.retainAll(other)" with something like:
if (other.isEmpty()) collection.clear();
else collection.retainAll(other);
However, this clutters the caller sites a lot.
java version "1.6.0_24"
Java(TM) SE Runtime Environment (build 1.6.0_24-b07)
Java HotSpot(TM) 64-Bit Server VM (build 19.1-b02, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Linux 3.0.0-12-generic #20-Ubuntu
A DESCRIPTION OF THE PROBLEM :
The method "AbstractCollection.retainAll(Collection<?> c)" is very
slow when the parameter "c" is empty (and "retainAll" should just
clear "this" collection). I attached a test that exposes this problem
and a one-line patch that fixes it. For this test, the patch provides
822X speedup on my machine.
The patch is very simple, just call:
if (c.isEmpty()) { clear(); return true; }
at the beginning of "retainAll".
To run the test, do:
$ java Test
The output for the un-patched version is:
Time is 822
The output for the patched version is:
Time is 1
There are other ways to speed up "retainAll", but they are more
complex, e.g., bug 5028425, whereas the above one-line patch is
trivial and provides considerable speedup for certain usage scenarios.
The method "AbstractCollection.removeAll(Collection<?> c)" can be
similarly optimized by adding "if (c.isEmpty()) return false;", but
the speed up is not as dramatic as for "retainAll".
Can you please fix this behavior?
The patch is:
======================================================================
--- AbstractCollection.java 2012-07-23 14:04:13.525513594 -0500
+++ AbstractCollection_Fix.java 2012-07-23 14:20:49.725537261 -0500
@@ -363,6 +363,7 @@
* @see #contains(Object)
*/
public boolean retainAll(Collection<?> c) {
+ if (c.isEmpty()) { clear(); return true; }
boolean modified = false;
Iterator<E> e = iterator();
while (e.hasNext()) {
======================================================================
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
$ java Test
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Time is 1
ACTUAL -
Time is 822
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
import java.util.ArrayList;
public class Test {
public static void main(String[] args) {
ArrayList<Integer> collection = new ArrayList<Integer>();
for (int i = 0; i < 100000; i++) collection.add(i);
ArrayList<Integer> other = new ArrayList<Integer>();
long start = System.currentTimeMillis();
collection.retainAll(other);
System.out.println("Time is " + (System.currentTimeMillis() - start));
}
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
I realize one can solve this problem from the caller site,
replacing "collection.retainAll(other)" with something like:
if (other.isEmpty()) collection.clear();
else collection.retainAll(other);
However, this clutters the caller sites a lot.
- relates to
-
JDK-6394757 (coll) AbstractSet.removeAll semantics are surprisingly dependent on relative sizes
- Open