https://github.com/openjdk/jdk/blob/01c29d8f2c865009c0d5379ba2e2cd4d3015f018/src/hotspot/share/oops/symbol.hpp#L158
  unsigned identity_hash() const {
    unsigned addr_bits = (unsigned)((uintptr_t)this >> (LogMinObjAlignmentInBytes + 3));
    return ((unsigned)extract_hash(_hash_and_refcount) & 0xffff) |
           ((addr_bits ^ (length() << 8) ^ (( _body[0] << 8) | _body[1])) << 16);
  }
The "+3" in the shift was introduced in JDK-8130115.
If I remember correctly, the intention was to avoid getting the same value for these bits:
  ((((uintptr_t)this) >> LogMinObjAlignmentInBytes) & 0x07)
However, it may not be necessary. Symbols have variable sizes (the string body is allocated as part of the Symbol), so their addresses should not cluster on a few values of these bits.
I wrote a program to analyze the distribution of the above expression (see attachment count_syms.tcl). On Linux/amd64, its values are roughly evenly distributed across all eight buckets.
This suggests it's safe to get rid of this suspicious +3 to make the code simpler.
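The measurement described above can be sketched in C++ as follows. This is not the attached count_syms.tcl, only a minimal illustration of the same idea: take a list of Symbol addresses, extract the three bits that the "+3" shift throws away, and count how often each value occurs. The names kLogMinAlignment, low_bucket and bucket_histogram are hypothetical, and the value 3 for the alignment shift is an assumption (8-byte minimum alignment).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: 8-byte minimum alignment, i.e. a stand-in for
// LogMinObjAlignmentInBytes == 3. Hypothetical constant, not HotSpot's.
constexpr unsigned kLogMinAlignment = 3;

// The three bits that the "+3" shift discards:
//   ((addr >> LogMinObjAlignmentInBytes) & 0x07)
inline unsigned low_bucket(std::uintptr_t addr) {
  return static_cast<unsigned>((addr >> kLogMinAlignment) & 0x07);
}

// Count how many addresses fall into each of the 8 possible values.
inline std::array<std::size_t, 8> bucket_histogram(
    const std::vector<std::uintptr_t>& addrs) {
  std::array<std::size_t, 8> counts{};
  for (std::uintptr_t a : addrs) {
    // Addresses are expected to be aligned to the minimum alignment.
    assert((a & ((std::uintptr_t{1} << kLogMinAlignment) - 1)) == 0);
    counts[low_bucket(a)]++;
  }
  return counts;
}
```

If the histogram is close to flat for real Symbol addresses, these three bits carry useful entropy and there is no reason to shift them out.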
==============
- check the distribution with CDS disabled:
$ java -XX:+UseNewCode -Xshare:off -cp ~/tmp HelloWorld | tclsh ~/count_syms.tcl
0: +++++++++++++++++
1: +++++++++++++++
2: +++++++++++++++++
3: ++++++++++++++++
4: ++++++++++++++++++
5: +++++++++++++++
6: ++++++++++++++++++
7: +++++++++++++++
0: 2506
1: 2161
2: 2499
3: 2245
4: 2548
5: 2174
6: 2537
7: 2134
- check the distribution for all symbols in the CDS archive:
$ java -XX:+UseNewCode -Xshare:dump | tclsh ~/count_syms.tcl
0: ++++++++++++++++++++++++++++++++++
1: +++++++++++++++++++++++++++++++++++
2: ++++++++++++++++++++++++++++++++++
3: +++++++++++++++++++++++++++++++++++
4: ++++++++++++++++++++++++++++++++++
5: +++++++++++++++++++++++++++++++++++
6: +++++++++++++++++++++++++++++++++++
7: +++++++++++++++++++++++++++++++++++
0: 4842
1: 5017
2: 4880
3: 4946
4: 4804
5: 4932
6: 4948
7: 4912
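To put a number on "evenly distributed", one option is to compute how far the worst bucket is from the mean. The sketch below does that for the two sets of counts reported above; the metric itself (maximum relative deviation) and the names max_relative_deviation, kCdsOff and kCdsDump are my own choices for illustration, not part of count_syms.tcl.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Largest relative deviation of any bucket from the mean bucket size.
// 0.0 means a perfectly uniform distribution.
inline double max_relative_deviation(const std::array<std::size_t, 8>& counts) {
  std::size_t total = 0;
  for (std::size_t c : counts) total += c;
  assert(total > 0 && "need at least one sample");
  const double mean = static_cast<double>(total) / counts.size();
  double worst = 0.0;
  for (std::size_t c : counts) {
    worst = std::max(worst, std::fabs(static_cast<double>(c) - mean) / mean);
  }
  return worst;
}

// Counts copied from the -Xshare:off and -Xshare:dump runs above.
constexpr std::array<std::size_t, 8> kCdsOff  =
    {2506, 2161, 2499, 2245, 2548, 2174, 2537, 2134};
constexpr std::array<std::size_t, 8> kCdsDump =
    {4842, 5017, 4880, 4946, 4804, 4932, 4948, 4912};
```

On these numbers every bucket stays within 10% of the mean for the -Xshare:off run and within 3% for the -Xshare:dump run, so no value of the three bits is starved or dominant.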
- is blocked by: JDK-8269962 SA has unused Hashtable, Dictionary classes (Resolved)
- relates to: JDK-8130115 REDO - Reduce Symbol::_identity_hash to 2 bytes (Resolved)