Loading...

Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: tbd
Affects Version/s: 21
Component/s: hotspot
Labels:
- metadata

Subcomponent:
runtime

The JVM spec defines various types of indices for accessing constant data. The most common type is a constant pool index (cp_index) for reading a specific element in the constant pool. For example, a getfield bytecode in a classfile is following by two bytes, which represent a 16-bit cp_index to the FieldRef constant associated with this bytecode.

When the classfile is loaded by HotSpot, many of the bytecodes are rewritten. For example, the two bytes following the getfield bytecode are rewritten to a different 16-bit value (the derived_index) to point to HotSpot's internal data structure (constantPoolCace). The effect of bytecode rewriting is implementation specific and is subject to change.

Currently, both the cp_index and the derived_index are of the int type.

E.g., in the code below, 'int idx' is a derived_index, which is required when calling cp->name_and_type_ref_index_at(int).

However, there's also a function cp->uncached_name_and_type_ref_index_at(int), which takes in a cp_index. Since both kind of indices are of the same C++ type, it's easy to get confused and call the wrong function.

https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/generateOopMap.cpp#L1600
void GenerateOopMap::interp1(BytecodeStream *itr) {
  [...]
  // abstract interpretation of current opcode
  switch(itr->code()) {
    case Bytecodes::_getstatic:do_field(true, true, itr->get_index_u2_cpcache(), itr->bci()); break;
    case Bytecodes::_putstatic:do_field(false, true, itr->get_index_u2_cpcache(), itr->bci()); break;

https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/generateOopMap.cpp#L1934

void GenerateOopMap::do_field(int is_get, int is_static, int idx, int bci) {
  // Dig up signature for field in constant pool
  ConstantPool* cp = method()->constants();
  int nameAndTypeIdx = cp->name_and_type_ref_index_at(idx);
  int signatureIdx = cp->signature_ref_index_at(nameAndTypeIdx);
  Symbol* signature = cp->symbol_at(signatureIdx);

A second problem is cp->name_and_type_ref_index_at(int derived_index) handles different bytecodes. As this version, it assumes that FieldRef and MethodRef are rewritten exactly the same way, and the derived_index for invokedynamic must be distinguisable from fields/methods:

https://github.com/openjdk/jdk/blob/8a70664e5248cd6b9d63951729e93bf73eff004c/src/hotspot/share/oops/constantPool.cpp#L689-L700

This makes it difficult to evolve the rewritter (e.g., to rewrite FieldRefs differently than MethodRefs)

===================================================
Proposal:

To avoid programming errors, and to improve flexibility in rewritting, we should encode the derived indices using distinct types. Here's a rough draft.

enum class FieldIndexType {....} // represents a rewritten index for fieldref
enum class MethodIndexType {....} // represents a rewritten index for methodref

Symbol* name;
int bci = ...;
Bytecodes::Code bc = method->code_at(bci);

if (bc == Bytecodes::_getfield) {
  FieldIndexType fld_index = method->field_index_at(bci);
  name = cp->field_name_at(fld_index);
} else if (bc == Bytecodes::_invokevirtual) {
  MethodIndexType mth_index = method->method_index_at(bci);
  name = cp->method_name_at(mth_index); // Can't pass mth_index to field_name_at().
} else ...

Note that we provide high level accessors for loading the derived indices to avoid explicit type casts. E.g., FieldIndexType Method::field_index_at(int bci). This way we can do more error checks (e.g., assert that bci indeed points to a field bytecode).

relates to

JDK-8307312 Replace "int which" with "int cp_index" in constantPool

Resolved

JDK-6711913 JVM needs linkage tables for efficient application loading

Closed

Details

Description

Attachments

Issue Links

Activity

People

Dates