Improve recursive type-variable definition handling

XMLWordPrintable

    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • None
    • Affects Version/s: repo-babylon
    • Component/s: core-libs

      In https://github.com/openjdk/babylon/pull/849 [~mcimadamore] wrote:

      The problem of recursive type-variables is quite difficult to address in full.

      Some history: initially, our suport for type variables in JavaType was restricted to so called "type variable references". E.g. we wanted to capture a "nominal" reference to a variable declaration somewhere (e.g. a method, or a class). This approach would have worked, and would have been free of issue, even in the presence of recursion.

      But, as we kept working with the type support, we realized we wanted to have an erasure operation on JavaType (see JavaType::erasure). Now, to correctly compute the erasure of a type variable, we need to know its bound. This is why we eventually augmented TypeVariable to also store a bound type. This means that when you call TypeVariable::erasure you get the erasure of the type-variable's bound (which is what you want). This operation is used several times in our samples, in the bytecode generator, etc. (so I think we need something like this).

      The problem with recursive type-variable definition is that a type-variable bound can contain reference to itself -- as this PR demonstrates:

      `<X extends Enum<X>>`

      When that happens, it's important to be able to distinguish between a type-variable declaration and a type-variable use. Since a type-variable use doesn't really need any bound (but it implicitly points to a declaration -- which has the bound), no issue arises. But since code models conflate declaration and use, we have an issue.

      We could use the logic in this PR to detect a cycle -- and, if so, erase the bound. This is somewhat consistent with what we're doing in other cases -- e.g. when the type attached to a Java expression is too complex and contains types that are not denotable in the source code. One nice property of doing so is that erasing the bound doesn't negatively impact on JavaType::erasure. That is, that operation would still work as intended -- only, the type-variable bound would be a bit less sharp.

      If we want to do that, and detect cycles in all cases, we need to be careful, as recursion can come in many ways -- here's another:

      `<X extends List<Y>, Y extends List<X>>`

      In this case just checking whether, say, X is contained inside the bound of of X will fail. That bound is List<Y>, so X's bound does NOT contain X (but it contains Y, whose bound contains X...)

      In other words, in order to properly detect all cycles, we need to try and write down the bound as a JavaType. When doing so, we need to keep track of all the type-variables we encounter. If we encounter the same type-variable twice, we need to break out, and fall back with an erased upper bound. (logic such as this is used several times throughout the javac code base).

      Are there other alternatives?

      One thing that came up to me is that we could model recursive bounds such as the one in this PR with this factory:

      `static TypeVariableType typeVarRef(String name, TypeVariableType.Owner owner, Function<TypeVariableType, JavaType> boundFunc) { ... }`

      This factory will:

      create a "dummy" type-variable T
      pass that T to the bound function, which can create any JavaType containing T -- let's call such type B
      when the function is done, set B as the real bound in T and return T (this will use some mutable state in the implementation)
      However, there are issues with this approach:

      it doesn't work with groups of mutually dependent type-variables (see above for an example)
      even assuming we could create a TypeVariable object for such a recursive definition, how are we then to serialize such object in the code model text?
      Another possibility would be to ditch the type-variable bound from TypeVariable, and instead record the type-variable's erasure. Since the erasure cannot, by definition, contain any type-variables, such an approach would avoid the need to check for cycles, while still leaving support for JavaType::erasure intact.

      Note: we currently only use TypeVariable::bound in two cases:

      to compute the type-variable erasure (see above)
      to determine type-variable equality
      We do not use the bound to e.g. resolve TypeVariable to a j.l.r.Type.

      If we plan to "relax" type-variable bounds, I think we should also tweak equality not to rely on the bound information. E.g. treat bound as a best effort information -- equality should always be computed using the type variable owner + type variable name (as that pair is "unique" enough).

            Assignee:
            Unassigned
            Reporter:
            Adam Sotona
            Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

              Created:
              Updated: