Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8288981

C2: Fix issues with skeleton predicates

XMLWordPrintable

      Since the introduction of skeleton predicates in JDK-8193130 to protect against over-unrolling a main loop, many new related bugs have been found and fixed - also for loop optimizations other than the main loop of pre/main/post (e.g. post loop, loop unswitching, loop peeling). The most recent bug fix (JDK-8283466) fixed a sub problem for loop peeling while there are still remaining unfixed cases for loop peeling (JDK-8278420) which are, however, rare in practice.

      Some recent changes (JDK-8230382) increase the probability of hitting these remaining problems with fuzzing but it's still hard to trigger this in practice.

      JDK-8286805 investigates stressing of various loop opts, including loop peeling. It tries to randomly apply different loop optimization even though the normal policy would not allow it. The idea is to only apply a loop optimization if the policy would not apply it due to inefficiency. This has revealed a lot of cases that can be traced back to some missing cases with skeleton predicates.

      To be able to reduce the noise in JDK-8286805, we should fix the missing skeleton predicate cases. To do that, we first want to redesign and refactor the skeleton predicates code and then fix the missing cases separately (like JDK-8278420).

      This RFE concerns the redesign and refactoring of skeleton predicates. Due to the many fixes that went in, the code became quite complex. On top of that, the naming scheme for all the different predicates is sometimes ambiguous and not consistent.


      In this RFE, we want to implement/perform the following ideas:
      - Find a good naming scheme to distinguish the different kinds of predicates. In the following, the currently existing names are used due to a lack of a better naming scheme as of today.
      - Group skeleton predicates together and place them right above a loop head:
        - Easier to find them again.
        - They should always be around until we are done with loop opts. This also allows us to rewire data nodes together with the cloning of the skeleton predicates.
        - They can still be found even though the empty predicates with the Conv2B/Opaque1 nodes are cleanup up (see call of Compile::cleanup_loop_predicates()).
        - The ordering of the skeleton predicates is most likely irrelevant.
      - Use Halt nodes instead of UCT when inserting the empty predicates to simplify the copying of skeleton predicates (they never trap) as we do not need to find a valid UCT (which might not always be found, e.g. after a peeled iteration).
      - Introduce a new OpaqueNode to better distinguish skeleton predicates from other predicates. This makes it easier to find them again and to reason about them when analyzing a graph.
      - Refactor code:
        - A lot of code looks similar and should be reused instead of copied.
        - Better method naming (e.g. cloning vs. copying).

      The redesign should fix JDK-8288941. Make sure to double check before the integration of this redesign.


      Skeleton Predicate Background
      ===================
      This section describes the basic idea of skeleton predicates.

      At the time we create a range check predicate RCP for a range check of an array access A inside a single (counted) loop L, the predicate covers the range of L. During runtime, we will trap before executing a single iteration of L if an array access would be outside of the allowed range. So, from the runtime perspective, it does not matter if we have a single loop L or if we split L into multiple sub loops that will together cover all the iterations of L.

      However, during C2 compilation we are facing problems when splitting L into different sub loops, lets say L1 and L2, while only having the single predicate RCP above all loops. In C2, data and control are separate. Data for the array access A (that uses the iv phi), could use type information of the updated loop bounds/updated iv phi for each individual loop L1 and L2. We could find that in L2, for example, an array access would be out of bounds. Data is replaced by TOP (done by Cast/Conv nodes that narrow a range and are replaced by TOP if the range becomes empty). This means that the loop L2 is never entered because the predicate RCP would trap during runtime. Therefore, the entire loop L2 should be removed - but that's not done. There is no check before L2 that could fold and let the loop die. We end up with a broken graph where some nodes died while others are still alive. To solve this, we insert a dummy skeleton predicate (that never fails at runtime because we would already trap wit RCP) for each sub loop of L that covers a part of the entire range of L. This allows L2 to die when we find that an array access is out of bounds.

      There are two places where we can split a loop into multiple sub loops. We need to insert a skeleton predicate for each sub loop:
      - Peeling: Split loop into "Peeled Iteration" + "Peeled Loop"
      - Pre/Main/Post: Split loop into "Pre Loop" + "Main Loop" + "Post Loop"

            chagedorn Christian Hagedorn
            chagedorn Christian Hagedorn
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated: