-
Bug
-
Resolution: Not an Issue
-
P4
-
6, 6u23, 6u24
-
x86
-
linux, windows_xp, windows_7
FULL PRODUCT VERSION :
java version "1.6.0_02-ea"
Java(TM) SE Runtime Environment (build 1.6.0_02-ea-b02)
Java HotSpot(TM) Client VM (build 1.6.0_02-ea-b02, mixed mode, sharing)
ADDITIONAL OS VERSION INFORMATION :
Linux calmer 2.6.9-42.0.10.EL #1 Fri Feb 16 17:06:10 EST 2007 i686 i686 i386 GNU/Linux
A DESCRIPTION OF THE PROBLEM :
The contains method of the Set interface promises:
"Returns true if this set contains the specified element. More formally, returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e))."
The HashSet implementation of the contains method violates this contract by returning "false" for objects it contains, under some circumstances.
Using a HashSet one can set up a simple loop where each iteration fetches an element from the set, asks the set if it contains said element, and puts out a msg if the set does not contain the element it just gave you. Eg:
for (E elem : mySet)
if (!mySet.contains(elem))
print(errMsg);
Under certain common situations, you will actually see the error msg, leading to the bizarre situation where the set does not contain elements that it just gave you!
The problem stems from using a HashMap to back the set, using the hashCode of the elements as a key, and then assuming the hashCode of an element doesn't change once the element is placed in the set.
It is very common to use value-equality (as opposed to reference-equality) for objects that are to be used in a set. This means overriding the equals method to compare values. Since, according to the javadoc for Object.equals, "equal objects must have equal hash codes", this also means overriding hashCode. If the objects in the set are mutable, the hashCodes of the objects may change while they live in the set. If the initial hashCode was changed, the current implementation of HashSet.contains loses its knowledge about the containment of the object.
(I understand that one issue with placing mutable objects in a set means that the set may not be able to guarantee uniqueness of its elements -- that it can only be a gate-keeper of uniqueness upon an element's entry into a set, and that it cannot know if its elements have later been made equal behind its back. That issue is, i believe, well understood and accepted by developers as a risk of placing mutable objects in a set. This issue of a set saying "I don't contain the element i just gave you" is fundamentally different and a consequence of the way the set was implemented -- something which those of us who "program to the interface" would like to remain blissfully unaware.)
Right now HashSet.contains returns "map.containsKey(o)". It might instead return "true" if "map.containsKey(o)" is "true", but not return "false" until after it has done a more exhaustive search on all if its elements. Sure, this slows things down for objects whose hash codes have changed since entry into the set, but it does something more important: it upholds the contract it has with the consumers of the Set interface.
This issue is similar, but not identical, to Bug IDs 4802577 and 4459681 -- both of which were deemed "not a bug". Before you classify this as not a bug, imagine this scenario:
You are an object that receives Set<Foo> from some other object. You have no idea where the Set was originally created and populated, and no idea what implementation of Set was used. If you have an object, myFoo, that really is in the set, is it right for Set<Foo>'s contain method to return false? If you're programming to the interface, the answer is "no".
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Put the class found below in $JAVA_HOME/dmh/set (or change its package and put it where you like), then run it.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Termination of program with no messages.
ACTUAL -
A single message stating:
Element named '[foo]' was pulled from the set, but the set claims it does not contain that element!
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
package dmh.set;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
/** Test of adherence of HashSet to Set interface. */
public class HashSetTest
{
private static char NEXT_NAME = 'A';
private char name;
private boolean b1;
private boolean b2;
private char c1;
private char c2;
private int i1;
private int i2;
private double d1;
private double d2;
/** Make new instance with random values. */
public HashSetTest()
{
name = NEXT_NAME++;
Random rand = new Random();
b1 = rand.nextDouble() < 0.5;
b1 = rand.nextDouble() < 0.5;
c1 = (char)rand.nextInt();
c2 = (char)rand.nextInt();
i1 = rand.nextInt();
i2 = rand.nextInt();
d1 = rand.nextDouble() * rand.nextInt();
d2 = rand.nextDouble() * rand.nextInt();
}
/** Returns <i>true</i> if {@code o} is value-equal to this object. */
@Override
public boolean equals(Object o)
{
//Quick exit if o is null
if (o == null)
return false;
//Quick exit if o is this
if (o == this)
return true;
//Quick exit if classes are different
if (!o.getClass().equals(this.getClass()))
return false;
//Safe cast
HashSetTest other = (HashSetTest)o;
return
other.b1 == this.b1 && other.b2 == this.b2 &&
other.c1 == this.c1 && other.c2 == this.c2 &&
other.i1 == this.i1 && other.i2 == this.i2 &&
other.d1 == this.d1 && other.d2 == this.d2;
}
@Override
public int hashCode()
{
int result = 17;
result = 37 * result + Boolean.valueOf(b1).hashCode();
result = 37 * result + Boolean.valueOf(b2).hashCode();
result = 37 * result + new Character(c1).hashCode();
result = 37 * result + new Character(c2).hashCode();
result = 37 * result + new Integer(i1).hashCode();
result = 37 * result + new Integer(i2).hashCode();
result = 37 * result + new Double(d1).hashCode();
result = 37 * result + new Double(d2).hashCode();
return result;
}
//==========================================================================//
/**
* Creates a HashSet of elements, modifies one element, then tests to
* ensure that the set knows the modified element is still part of the
* set. Unfortunately, the HashSet's contains method loses the fact
* that the element is part of the set.
*/
public static void main(String[] args) throws Exception
{
Set<HashSetTest> testSet = new HashSet<HashSetTest>();
for (int e=0; e < 10; e++)
testSet.add(new HashSetTest());
//Ensure that set knows what it contains
testContainmentOfOwnElements(testSet);
//Grab an element of the set, modify it, and retest for containment.
Random rand = new Random();
HashSetTest element =
(HashSetTest)testSet.toArray()[rand.nextInt(testSet.size())];
element.b1 = !element.b1;
element.c1 /= 2;
element.i1 *= 3;
element.d1 *= element.d2;
testContainmentOfOwnElements(testSet);
}
/**
* Fetches every element from {@code set} and then asks {@code set}
* if it contains the element. For each element retrieved from the
* set that the set denies it contains, a message is displayed
* on the console.
*/
private static void testContainmentOfOwnElements(Set<HashSetTest> set)
{
for (HashSetTest elem : set)
{
if (!set.contains(elem))
System.out.println(
"Element named '" + elem.name +
"' was pulled from the set, but the set claims it does not contain that element!");
}
}
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
The above description is a vast simplification of our actual situation. We really have an object Foo which contains a "private Set<Foo> prerequisites" variable, which can lead to a tree of Foo. We use JAXB to put Foo into an external XML file. The Foo objects that are used as prerequisites use XML's IDREF type. When we put Foo to XML and then bring it back from XML into another Foo instance, our equals method fails on testing the sets of prerequisites. It looks like the Foo prereqs enter the set before they are fully formed (presumably due to resolving the IDREFs). This means their hash codes change after they enter the set, under the control of JAXB, not our app code.
We'll probably rework Foo's equals method so that we no longer have "other.fooSet.equals(this.fooSet)". Putting other.fooSet and this.fooSet into NEW sets just before testing for equality might work.
java version "1.6.0_02-ea"
Java(TM) SE Runtime Environment (build 1.6.0_02-ea-b02)
Java HotSpot(TM) Client VM (build 1.6.0_02-ea-b02, mixed mode, sharing)
ADDITIONAL OS VERSION INFORMATION :
Linux calmer 2.6.9-42.0.10.EL #1 Fri Feb 16 17:06:10 EST 2007 i686 i686 i386 GNU/Linux
A DESCRIPTION OF THE PROBLEM :
The contains method of the Set interface promises:
"Returns true if this set contains the specified element. More formally, returns true if and only if this set contains an element e such that (o==null ? e==null : o.equals(e))."
The HashSet implementation of the contains method violates this contract by returning "false" for objects it contains, under some circumstances.
Using a HashSet one can set up a simple loop where each iteration fetches an element from the set, asks the set if it contains said element, and puts out a msg if the set does not contain the element it just gave you. Eg:
for (E elem : mySet)
if (!mySet.contains(elem))
print(errMsg);
Under certain common situations, you will actually see the error msg, leading to the bizarre situation where the set does not contain elements that it just gave you!
The problem stems from using a HashMap to back the set, using the hashCode of the elements as a key, and then assuming the hashCode of an element doesn't change once the element is placed in the set.
It is very common to use value-equality (as opposed to reference-equality) for objects that are to be used in a set. This means overriding the equals method to compare values. Since, according to the javadoc for Object.equals, "equal objects must have equal hash codes", this also means overriding hashCode. If the objects in the set are mutable, the hashCodes of the objects may change while they live in the set. If the initial hashCode was changed, the current implementation of HashSet.contains loses its knowledge about the containment of the object.
(I understand that one issue with placing mutable objects in a set means that the set may not be able to guarantee uniqueness of its elements -- that it can only be a gate-keeper of uniqueness upon an element's entry into a set, and that it cannot know if its elements have later been made equal behind its back. That issue is, i believe, well understood and accepted by developers as a risk of placing mutable objects in a set. This issue of a set saying "I don't contain the element i just gave you" is fundamentally different and a consequence of the way the set was implemented -- something which those of us who "program to the interface" would like to remain blissfully unaware.)
Right now HashSet.contains returns "map.containsKey(o)". It might instead return "true" if "map.containsKey(o)" is "true", but not return "false" until after it has done a more exhaustive search on all if its elements. Sure, this slows things down for objects whose hash codes have changed since entry into the set, but it does something more important: it upholds the contract it has with the consumers of the Set interface.
This issue is similar, but not identical, to Bug IDs 4802577 and 4459681 -- both of which were deemed "not a bug". Before you classify this as not a bug, imagine this scenario:
You are an object that receives Set<Foo> from some other object. You have no idea where the Set was originally created and populated, and no idea what implementation of Set was used. If you have an object, myFoo, that really is in the set, is it right for Set<Foo>'s contain method to return false? If you're programming to the interface, the answer is "no".
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Put the class found below in $JAVA_HOME/dmh/set (or change its package and put it where you like), then run it.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Termination of program with no messages.
ACTUAL -
A single message stating:
Element named '[foo]' was pulled from the set, but the set claims it does not contain that element!
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
package dmh.set;
import java.util.HashSet;
import java.util.Random;
import java.util.Set;
/** Test of adherence of HashSet to Set interface. */
public class HashSetTest
{
private static char NEXT_NAME = 'A';
private char name;
private boolean b1;
private boolean b2;
private char c1;
private char c2;
private int i1;
private int i2;
private double d1;
private double d2;
/** Make new instance with random values. */
public HashSetTest()
{
name = NEXT_NAME++;
Random rand = new Random();
b1 = rand.nextDouble() < 0.5;
b1 = rand.nextDouble() < 0.5;
c1 = (char)rand.nextInt();
c2 = (char)rand.nextInt();
i1 = rand.nextInt();
i2 = rand.nextInt();
d1 = rand.nextDouble() * rand.nextInt();
d2 = rand.nextDouble() * rand.nextInt();
}
/** Returns <i>true</i> if {@code o} is value-equal to this object. */
@Override
public boolean equals(Object o)
{
//Quick exit if o is null
if (o == null)
return false;
//Quick exit if o is this
if (o == this)
return true;
//Quick exit if classes are different
if (!o.getClass().equals(this.getClass()))
return false;
//Safe cast
HashSetTest other = (HashSetTest)o;
return
other.b1 == this.b1 && other.b2 == this.b2 &&
other.c1 == this.c1 && other.c2 == this.c2 &&
other.i1 == this.i1 && other.i2 == this.i2 &&
other.d1 == this.d1 && other.d2 == this.d2;
}
@Override
public int hashCode()
{
int result = 17;
result = 37 * result + Boolean.valueOf(b1).hashCode();
result = 37 * result + Boolean.valueOf(b2).hashCode();
result = 37 * result + new Character(c1).hashCode();
result = 37 * result + new Character(c2).hashCode();
result = 37 * result + new Integer(i1).hashCode();
result = 37 * result + new Integer(i2).hashCode();
result = 37 * result + new Double(d1).hashCode();
result = 37 * result + new Double(d2).hashCode();
return result;
}
//==========================================================================//
/**
* Creates a HashSet of elements, modifies one element, then tests to
* ensure that the set knows the modified element is still part of the
* set. Unfortunately, the HashSet's contains method loses the fact
* that the element is part of the set.
*/
public static void main(String[] args) throws Exception
{
Set<HashSetTest> testSet = new HashSet<HashSetTest>();
for (int e=0; e < 10; e++)
testSet.add(new HashSetTest());
//Ensure that set knows what it contains
testContainmentOfOwnElements(testSet);
//Grab an element of the set, modify it, and retest for containment.
Random rand = new Random();
HashSetTest element =
(HashSetTest)testSet.toArray()[rand.nextInt(testSet.size())];
element.b1 = !element.b1;
element.c1 /= 2;
element.i1 *= 3;
element.d1 *= element.d2;
testContainmentOfOwnElements(testSet);
}
/**
* Fetches every element from {@code set} and then asks {@code set}
* if it contains the element. For each element retrieved from the
* set that the set denies it contains, a message is displayed
* on the console.
*/
private static void testContainmentOfOwnElements(Set<HashSetTest> set)
{
for (HashSetTest elem : set)
{
if (!set.contains(elem))
System.out.println(
"Element named '" + elem.name +
"' was pulled from the set, but the set claims it does not contain that element!");
}
}
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
The above description is a vast simplification of our actual situation. We really have an object Foo which contains a "private Set<Foo> prerequisites" variable, which can lead to a tree of Foo. We use JAXB to put Foo into an external XML file. The Foo objects that are used as prerequisites use XML's IDREF type. When we put Foo to XML and then bring it back from XML into another Foo instance, our equals method fails on testing the sets of prerequisites. It looks like the Foo prereqs enter the set before they are fully formed (presumably due to resolving the IDREFs). This means their hash codes change after they enter the set, under the control of JAXB, not our app code.
We'll probably rework Foo's equals method so that we no longer have "other.fooSet.equals(this.fooSet)". Putting other.fooSet and this.fooSet into NEW sets just before testing for equality might work.
- duplicates
-
JDK-6584544 (coll) Wrong Set.add(E) method implementation in HashSet
-
- Closed
-
-
JDK-7014079 java.util.HashSet's contains() does not honor java.util.Collection interface
-
- Closed
-
-
JDK-7027073 Description error in JavaDoc for java.util.HashSet
-
- Closed
-
- relates to
-
JDK-7030899 (coll) TreeSet uses the wrong method for equality detection during add(E)
-
- Closed
-