The JVM spec defines various types of indices for accessing constant data. The most common type is a constant pool index (cp_index) for reading a specific element in the constant pool. For example, a getfield bytecode in a classfile is following by two bytes, which represent a 16-bit cp_index to the FieldRef constant associated with this bytecode.
When the classfile is loaded by HotSpot, many of the bytecodes are rewritten. For example, the two bytes following the getfield bytecode are rewritten to a different 16-bit value (the derived_index) to point to HotSpot's internal data structure (constantPoolCace). The effect of bytecode rewriting is implementation specific and is subject to change.
Currently, both the cp_index and the derived_index are of the int type.
E.g., in the code below, 'int idx' is a derived_index, which is required when calling cp->name_and_type_ref_index_at(int).
However, there's also a function cp->uncached_name_and_type_ref_index_at(int), which takes in a cp_index. Since both kind of indices are of the same C++ type, it's easy to get confused and call the wrong function.
https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/generateOopMap.cpp#L1600
void GenerateOopMap::interp1(BytecodeStream *itr) {
[...]
// abstract interpretation of current opcode
switch(itr->code()) {
case Bytecodes::_getstatic:do_field(true, true, itr->get_index_u2_cpcache(), itr->bci()); break;
case Bytecodes::_putstatic:do_field(false, true, itr->get_index_u2_cpcache(), itr->bci()); break;
https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/generateOopMap.cpp#L1934
void GenerateOopMap::do_field(int is_get, int is_static, int idx, int bci) {
// Dig up signature for field in constant pool
ConstantPool* cp = method()->constants();
int nameAndTypeIdx = cp->name_and_type_ref_index_at(idx);
int signatureIdx = cp->signature_ref_index_at(nameAndTypeIdx);
Symbol* signature = cp->symbol_at(signatureIdx);
A second problem is cp->name_and_type_ref_index_at(int derived_index) handles different bytecodes. As this version, it assumes that FieldRef and MethodRef are rewritten exactly the same way, and the derived_index for invokedynamic must be distinguisable from fields/methods:
https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/constantPool.cpp#L689-L700
This makes it difficult to evolve the rewritter (e.g., to rewrite FieldRefs differently than MethodRefs)
===================================================
Proposal:
To avoid programming errors, and to improve flexibility in rewritting, we should encode the derived indices using distinct types. Here's a rough draft.
enum class FieldIndexType {....} // represents a rewritten index for fieldref
enum class MethodIndexType {....} // represents a rewritten index for methodref
Symbol* name;
int bci = ...;
Bytecodes::Code bc = method->code_at(bci);
if (bc == Bytecodes::_getfield) {
FieldIndexType fld_index = method->field_index_at(bci);
name = cp->field_name_at(fld_index);
} else if (bc == Bytecodes::_invokevirtual) {
MethodIndexType mth_index = method->method_index_at(bci);
name = cp->method_name_at(mth_index); // Can't pass mth_index to field_name_at().
} else ...
Note that we provide high level accessors for loading the derived indices to avoid explicit type casts. E.g., FieldIndexType Method::field_index_at(int bci). This way we can do more error checks (e.g., assert that bci indeed points to a field bytecode).
When the classfile is loaded by HotSpot, many of the bytecodes are rewritten. For example, the two bytes following the getfield bytecode are rewritten to a different 16-bit value (the derived_index) to point to HotSpot's internal data structure (constantPoolCace). The effect of bytecode rewriting is implementation specific and is subject to change.
Currently, both the cp_index and the derived_index are of the int type.
E.g., in the code below, 'int idx' is a derived_index, which is required when calling cp->name_and_type_ref_index_at(int).
However, there's also a function cp->uncached_name_and_type_ref_index_at(int), which takes in a cp_index. Since both kind of indices are of the same C++ type, it's easy to get confused and call the wrong function.
https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/generateOopMap.cpp#L1600
void GenerateOopMap::interp1(BytecodeStream *itr) {
[...]
// abstract interpretation of current opcode
switch(itr->code()) {
case Bytecodes::_getstatic:do_field(true, true, itr->get_index_u2_cpcache(), itr->bci()); break;
case Bytecodes::_putstatic:do_field(false, true, itr->get_index_u2_cpcache(), itr->bci()); break;
https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/generateOopMap.cpp#L1934
void GenerateOopMap::do_field(int is_get, int is_static, int idx, int bci) {
// Dig up signature for field in constant pool
ConstantPool* cp = method()->constants();
int nameAndTypeIdx = cp->name_and_type_ref_index_at(idx);
int signatureIdx = cp->signature_ref_index_at(nameAndTypeIdx);
Symbol* signature = cp->symbol_at(signatureIdx);
A second problem is cp->name_and_type_ref_index_at(int derived_index) handles different bytecodes. As this version, it assumes that FieldRef and MethodRef are rewritten exactly the same way, and the derived_index for invokedynamic must be distinguisable from fields/methods:
https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/constantPool.cpp#L689-L700
This makes it difficult to evolve the rewritter (e.g., to rewrite FieldRefs differently than MethodRefs)
===================================================
Proposal:
To avoid programming errors, and to improve flexibility in rewritting, we should encode the derived indices using distinct types. Here's a rough draft.
enum class FieldIndexType {....} // represents a rewritten index for fieldref
enum class MethodIndexType {....} // represents a rewritten index for methodref
Symbol* name;
int bci = ...;
Bytecodes::Code bc = method->code_at(bci);
if (bc == Bytecodes::_getfield) {
FieldIndexType fld_index = method->field_index_at(bci);
name = cp->field_name_at(fld_index);
} else if (bc == Bytecodes::_invokevirtual) {
MethodIndexType mth_index = method->method_index_at(bci);
name = cp->method_name_at(mth_index); // Can't pass mth_index to field_name_at().
} else ...
Note that we provide high level accessors for loading the derived indices to avoid explicit type casts. E.g., FieldIndexType Method::field_index_at(int bci). This way we can do more error checks (e.g., assert that bci indeed points to a field bytecode).
- relates to
-
JDK-8307312 Replace "int which" with "int cp_index" in constantPool
- Resolved
-
JDK-6711913 JVM needs linkage tables for efficient application loading
- Closed