JDK-8317507

C2 compilation fails with "Exceeded _node_regs array"


Details

    • b22
    • Verified

      Description

        java -Xmx1G -XX:+IgnoreUnrecognizedVMOptions -XX:CompileCommand=quiet -XX:CompileCommand=compileonly,*Test*::* -XX:-TieredCompilation -Xcomp -XX:+UnlockDiagnosticVMOptions -XX:+StressGCM -XX:UseAVX=2 Test.java

        # A fatal error has been detected by the Java Runtime Environment:
        #
        # Internal Error (.../open/src/hotspot/share/opto/regalloc.hpp:85), pid=1681802, tid=1681816
        # assert(idx < _node_regs_max_index) failed: Exceeded _node_regs array
        #
        # JRE version: Java(TM) SE Runtime Environment (22.0+17) (fastdebug build 22-ea+17-1342)
        # Java VM: Java HotSpot(TM) 64-Bit Server VM (fastdebug 22-ea+17-1342, compiled mode, sharing, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
        # Problematic frame:
        # V [libjvm.so+0x6ebf45] PhaseCFG::insert_goto_at(unsigned int, unsigned int)+0x695

        Current CompileTask:
        C2: 3200 109 b Test::vMeth (119 bytes)

        Stack: [0x00007f12d5057000,0x00007f12d5158000], sp=0x00007f12d5153e90, free space=1011k
        Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
        V [libjvm.so+0x6ebf45] PhaseCFG::insert_goto_at(unsigned int, unsigned int)+0x695 (regalloc.hpp:85)
        V [libjvm.so+0x6ede8c] PhaseCFG::fixup_flow()+0x1ac
        V [libjvm.so+0x9ee37d] Compile::Code_Gen()+0x4ad
        V [libjvm.so+0x9f107e] Compile::Compile(ciEnv*, ciMethod*, int, Options, DirectiveSet*)+0x1c9e
        ...

        FAILURE ANALYSIS

        The failure is caused by a seemingly legal but degenerate Ideal graph in which around 94% of the nodes (544 out of 579 after Compile::Optimize()) are floating-point additions (AddF). On x64, these nodes, whose second operand is `inc` (see attached TestSimpler.java), are initially implemented with addF_reg_reg machine nodes. Register allocation spills `inc`, and PhaseChaitin::fixup_spills() then replaces each addF_reg_reg machine node with its memory-operand version (addF_reg_mem). PhaseRegAlloc has allocated 1118 elements for PhaseRegAlloc::_node_regs (612 + (612 >> 1) + 200, as per PhaseRegAlloc::alloc_node_regs), but each transformation from addF_reg_reg to addF_reg_mem takes a fresh node ID (Compile::next_unique()), so Compile::_unique eventually grows beyond the size of PhaseRegAlloc::_node_regs. The assertion then fails when PhaseRegAlloc::set_pair is called for a node newly created after register allocation (e.g. during the target-dependent peephole phase).
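
        The sizing arithmetic above can be checked with a small stand-alone model. This is only a sketch in Java, not HotSpot code: the class and method names are hypothetical, and the headroom figure assumes, purely for illustration, that 612 node IDs were already in use when the array was sized.

```java
// Hypothetical model of the _node_regs sizing described in the report.
// Not HotSpot code; names are made up for illustration.
class NodeRegsBudget {
    // Sizing formula quoted in the report: size + (size >> 1) + 200.
    static int nodeRegsSize(int size) {
        return size + (size >> 1) + 200;
    }

    // Fresh node IDs that still fit in the array, ASSUMING that 'idsInUse'
    // IDs (0 .. idsInUse - 1) were taken when the array was sized.
    // Valid indices are 0 .. nodeRegsSize(idsInUse) - 1.
    static int freshIdsBeforeOverflow(int idsInUse) {
        return nodeRegsSize(idsInUse) - idsInUse;
    }

    public static void main(String[] args) {
        int size = 612; // value from the report
        System.out.println("array size = " + nodeRegsSize(size));
        System.out.println("headroom   = " + freshIdsBeforeOverflow(size));
    }
}
```

        Under that assumption the array leaves room for 506 fresh IDs, fewer than the 544 AddF nodes reported after Compile::Optimize(), which is consistent with the index running past the end once enough nodes have been replaced or created.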

        The reason why this failure occurs only after JDK-8287087 is that this changeset makes it possible to detect a reduction chain that was undetectable before, when the innermost loop has been fully unrolled:

            static float test(float inc) {
                int i = 0, j = 0;
                float f = dontInline();
                while (i++ < 128) {
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                    f += inc;
                }
                return f;
            }

        This stronger analysis result provided by JDK-8287087 leads the SLP early unrolling policy (SuperWord::unrolling_analysis()) to request additional unrolling of the outermost loop. Due to limitations in the superword framework, however, the loop is ultimately not vectorized, which leaves a graph with a very high density of AddF nodes (512 AddF nodes in the main loop body).

        Potential solutions include:
        - reusing the node ID of the replaced nodes in PhaseChaitin::fixup_spills() and/or adjusting Compile::_unique appropriately,
        - resizing _node_regs on an out-of-bounds attempt (e.g. using a growable array),
        - further increasing the size of _node_regs, and
        - adjusting the loop unrolling policy to avoid excessive unrolling for pure reduction loops.
        A temporary workaround is to use -XX:-UseCISCSpill.
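
        The second option (resizing on an out-of-bounds attempt) could look roughly like the sketch below. It is a Java model with hypothetical names, not the actual HotSpot data structure: the element type is simplified to int, and the growth factor is an arbitrary choice made for the example.

```java
import java.util.Arrays;

// Hypothetical model of a node-to-register table that grows on demand
// instead of asserting when an index exceeds the current capacity.
class GrowableNodeRegs {
    private int[] regs;

    GrowableNodeRegs(int initialSize) {
        regs = new int[initialSize];
    }

    // Store a register for node 'idx', growing the table if needed.
    void set(int idx, int reg) {
        if (idx >= regs.length) {
            // Grow by 50%, or just enough to hold idx, whichever is larger.
            int newLen = Math.max(idx + 1, regs.length + (regs.length >> 1));
            regs = Arrays.copyOf(regs, newLen); // new slots are zero-filled
        }
        regs[idx] = reg;
    }

    // Nodes beyond the current capacity read as 0 ("no register" in this model).
    int get(int idx) {
        return idx < regs.length ? regs[idx] : 0;
    }

    int capacity() {
        return regs.length;
    }
}
```

        With this shape, a node created after register allocation only pays for a copy when it actually exceeds the table, so the common case keeps its fixed-size behaviour.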

        Attachments

          1. FuzzerUtils.java
            13 kB
          2. Manual.java
            9 kB
          3. Test.java
            7 kB
          4. TestSimple.java
            0.7 kB
          5. TestSimpler.java
            0.5 kB

              People

                rcastanedalo Roberto Castaneda Lozano
                thartmann Tobias Hartmann