JDK-8313689

C2: compiler/c2/irTests/scalarReplacement/AllocationMergesTests.java fails intermittently with -XX:-TieredCompilation

    • b12

      On some machines (x64 and aarch), compiler/c2/irTests/scalarReplacement/AllocationMergesTests.java fails intermittently with -XX:-TieredCompilation because C2 does not remove allocations that the test's IR rules expect to be eliminated.

      The number of failures varies with the warm-up count (e.g. -DWarmup=1000, 2000, 10000). This suggests that a given number of warm-up iterations is enough for the test to pass on some machines but not on others. Raising the warm-up count does not reliably help, however: even with a very high value (10000), I still got 5 failures.

      We should find the root cause of why allocations cannot be removed for certain numbers of warm-up iterations and fix the test or the code accordingly.

      Output:

      Compilation of Failed Method
      ----------------------------
      1) Compilation of "int compiler.c2.irTests.scalarReplacement.AllocationMergesTests.testNoEscapeWithLoadInLoop_C2(boolean,int,int)":
      > Phase "PrintOptoAssembly":
      ----------------------- MetaData before Compile_id = 374 ------------------------
      {method}
       - this oop: 0x00007f1a134210d8
       - method holder: 'compiler/c2/irTests/scalarReplacement/AllocationMergesTests'
       - constants: 0x00007f1a1341c000 constant pool [641] {0x00007f1a1341c000} for 'compiler/c2/irTests/scalarReplacement/AllocationMergesTests' cache=0x00007f1a13424780
       - access: 0x0
       - flags: 0x5080 queued_for_compilation dont_inline has_loops_flag_init
       - name: 'testNoEscapeWithLoadInLoop_C2'
       - signature: '(ZII)I'
       - max stack: 5
       - max locals: 4
       - size of params: 4
       - method size: 14
       - vtable index: 13
       - i2i entry: 0x00007f1a98b50a40
       - adapters: AHE@0x00007f1aa42061b0: 0xbaaa i2c: 0x00007f1a98bb8600 c2i: 0x00007f1a98bb86f7 c2iUV: 0x00007f1a98bb86c5 c2iNCI: 0x00007f1a98bb8731
       - compiled entry 0x00007f1a98bb86f7
       - code size: 8
       - code start: 0x00007f1a134210c0
       - code end (excl): 0x00007f1a134210c8
       - method data: 0x00007f1a13499d20
       - checked ex length: 0
       - linenumber start: 0x00007f1a134210c8
       - localvar length: 0

      ------------------------ OptoAssembly for Compile_id = 374 -----------------------
      #
      # int ( compiler/c2/irTests/scalarReplacement/AllocationMergesTests:NotNull *, int, int, int )
      #
      000 N273: # out( B1 ) <- BLOCK HEAD IS JUNK Freq: 1
      000 movl rscratch1, [j_rarg0 + oopDesc::klass_offset_in_bytes()] # compressed klass
      decode_klass_not_null rscratch1, rscratch1
      cmpq rax, rscratch1 # Inline cache check
      jne SharedRuntime::_ic_miss_stub
      nop # nops to align entry point

              nop # 4 bytes pad for loops and calls

      020 B1: # out( B12 B2 ) <- BLOCK HEAD IS JUNK Freq: 1
      020 # stack bang (304 bytes)
      pushq rbp # Save rbp
      subq rsp, #64 # Create frame

      03a movl [rsp + #12], R8 # spill
      03f movl [rsp + #8], RCX # spill
      043 movq [rsp + #0], RSI # spill
      047 movl [rsp + #16], RDX # spill
      04b testl RDX, RDX
      04d je B12 P=0.100000 C=-1.000000

      053 B2: # out( B13 B3 ) <- in( B1 ) Freq: 0.9
      053 # TLS is in R15
      053 movq RAX, [R15 + #456 (32-bit)] # ptr
      05a movq R10, RAX # spill
      05d addq R10, #24 # ptr
      061 cmpq R10, [R15 + #472 (32-bit)] # raw ptr
      068 jae,u B13 P=0.000100 C=-1.000000

      06e B3: # out( B4 ) <- in( B2 ) Freq: 0.89991
      06e movq [R15 + #456 (32-bit)], R10 # ptr
      075 PREFETCHW [R10 + #192 (32-bit)] # Prefetch allocation into level 1 cache and mark modified
      07d movq [RAX], #1 # long
      084 movl [RAX + #8 (8-bit)], narrowklass: precise compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # compressed klass ptr
      08b movl [RAX + #12 (8-bit)], R12 # int (R12_heapbase==0)
      08f movq [RAX + #16 (8-bit)], R12 # long (R12_heapbase==0)

      093 B4: # out( B16 B5 ) <- in( B14 B3 ) Freq: 0.9
      093
      093 MEMBAR-storestore (empty encoding)
      093 movq RBP, RAX # spill
      096 # checkcastPP of RBP
      096 movq RSI, RBP # spill
      099 movl RDX, [rsp + #12] # spill
      09d movl RCX, [rsp + #8] # spill
              nop # 2 bytes pad for loops and calls
      0a3 call,static compiler.c2.irTests.scalarReplacement.AllocationMergesTests$Point::<init>
              # compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop @ bci:24 (line 949) L[0]=rsp + #0 L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12 L[4]=#ScObj0 L[5]=#0 L[6]=_ STK[0]=RBP
              # ScObj0 compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point={ [x :0]=rsp + #8, [y :1]=rsp + #12 }
              # compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop_C2 @ bci:4 (line 962) L[0]=rsp + #0 L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12
              # OopMap {rbp=Oop [0]=Oop off=168/0xa8}

      0b0 B5: # out( B6 ) <- in( B4 ) Freq: 0.899982
              # Block is sole successor of call
      0b0 movl R10, [RBP + #16 (8-bit)] # int ! Field: compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point.y
      0b4 movl RAX, [RBP + #12 (8-bit)] # int ! Field: compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point.x

      0b7 B6: # out( B8 ) <- in( B5 B12 ) Freq: 0.999982
      0b7 leal R11, [RAX + R10]
      0bb leal R8, [R11 + #3342]
      0c2 movl R9, #3343 # int
      0c8 jmp,s B8
              nop # 6 bytes pad for loops and calls

      0d0 B7: # out( B8 ) <- in( B8 ) top-of-loop Freq: 903.513
      0d0 movl R9, RCX # spill

      0d3 B8: # out( B7 B9 ) <- in( B6 B7 ) Loop( B8-B7 inner main of N10) Freq: 904.513
      0d3 leal RBX, [R9 + R11]
      0d7 addl R8, RBX # int
      0da addl R8, RBX # int
      0dd addl R8, RBX # int
      0e0 addl R8, RBX # int
      0e3 addl R8, RBX # int
      0e6 addl R8, RBX # int
      0e9 addl R8, RBX # int
      0ec addl R8, RBX # int
      0ef addl R8, RBX # int
      0f2 addl R8, RBX # int
      0f5 addl R8, RBX # int
      0f8 addl R8, RBX # int
      0fb addl R8, RBX # int
      0fe addl R8, RBX # int
      101 addl R8, RBX # int
      104 addl R8, RBX # int
      107 addl R8, RBX # int
      10a addl R8, RBX # int
      10d addl R8, RBX # int
      110 addl R8, RBX # int
      113 addl R8, RBX # int
      116 addl R8, RBX # int
      119 addl R8, RBX # int
      11c addl R8, RBX # int
      11f addl R8, RBX # int
      122 addl R8, RBX # int
      125 addl R8, RBX # int
      128 addl R8, RBX # int
      12b addl R8, RBX # int
      12e addl R8, RBX # int
      131 addl R8, RBX # int
      134 addl R8, RBX # int
      137 addl R8, #496 # int
      13e leal RCX, [R9 + #32]
      142 cmpl RCX, #4207
      148 jl,s B7 # loop end P=0.998894 C=20781.000000

      14a B9: # out( B10 ) <- in( B8 ) Freq: 0.999982
      14a # castII of R9
      14a addl R9, #32 # int

      14e B10: # out( B10 B11 ) <- in( B9 B10 ) Loop( B10-B10 inner post of N309) Freq: 1.99996
      14e leal RCX, [R11 + R9]
      152 addl R8, RCX # int
      155 incl R9 # int
              nop # 8 bytes pad for loops and calls
      160 cmpl R9, #4234
      167 jl,s B10 # loop end P=0.500000 C=20781.000000

      169 B11: # out( N273 ) <- in( B10 ) Freq: 0.999982
      169 addl RAX, R8 # int
      16c addl RAX, R10 # int
      16f addq rsp, 64 # Destroy frame
      popq rbp
      cmpq rsp, poll_offset[r15_thread]
      ja #safepoint_stub # Safepoint: poll for GC

      181 ret

      182 B12: # out( B6 ) <- in( B1 ) Freq: 0.1
      182 movl R10, R8 # spill
      185 movl RAX, RCX # spill
      187 jmp B6

      18c B13: # out( B15 B14 ) <- in( B2 ) Freq: 9.00149e-05
      18c movq RSI, precise compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # ptr
      196 movq RBP, [rsp + #0] # spill
              nop # 1 bytes pad for loops and calls
      19b call,static wrapper for: _new_instance_Java
              # compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop @ bci:18 (line 949) L[0]=RBP L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12 L[4]=#ScObj0 L[5]=#0 L[6]=_
              # ScObj0 compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point={ [x :0]=rsp + #8, [y :1]=rsp + #12 }
              # compiler.c2.irTests.scalarReplacement.AllocationMergesTests::testNoEscapeWithLoadInLoop_C2 @ bci:4 (line 962) L[0]=RBP L[1]=rsp + #16 L[2]=rsp + #8 L[3]=rsp + #12
              # OopMap {rbp=Oop [0]=Oop off=416/0x1a0}

      1a8 B14: # out( B4 ) <- in( B13 ) Freq: 9.00131e-05
              # Block is sole successor of call
      1a8 jmp B4

      1ad B15: # out( B17 ) <- in( B13 ) Freq: 9.00149e-10
      1ad # exception oop is in rax; no code emitted
      1ad movq RSI, RAX # spill
      1b0 jmp,s B17

      1b2 B16: # out( B17 ) <- in( B4 ) Freq: 9e-06
      1b2 # exception oop is in rax; no code emitted
      1b2 movq RSI, RAX # spill

      1b5 B17: # out( N273 ) <- in( B16 B15 ) Freq: 9.0009e-06
      1b5 addq rsp, 64 # Destroy frame
      popq rbp

      1ba jmp rethrow_stub

      --------------------------------------------------------------------------------

      STDERR:

      Command Line:
      /scratch/chagedor/jdk/open/jdk-22/fastdebug/bin/java -DReproduce=true -cp /scratch/chagedor/jdk/open/JTwork/classes/compiler/c2/irTests/scalarReplacement/AllocationMergesTests.d:/scratch/chagedor/jdk/open/test/hotspot/jtreg/compiler/c2/irTests/scalarReplacement:/scratch/chagedor/jdk/open/JTwork/classes/test/lib:/scratch/chagedor/jdk/open/JTwork/classes:/home/chagedor/jtreg/lib/javatest.jar:/home/chagedor/jtreg/lib/jtreg.jar:/home/chagedor/jtreg/lib/junit-platform-console-standalone-1.9.2.jar:/home/chagedor/jtreg/lib/testng-7.3.0.jar:/home/chagedor/jtreg/lib/jcommander-1.78.jar:/home/chagedor/jtreg/lib/guice-4.2.3.jar -Djava.library.path=. -Xbootclasspath/a:. -XX:+UnlockDiagnosticVMOptions -XX:+WhiteBoxAPI -DWarmup=2000 -XX:+CreateCoredumpOnCrash -ea -esa -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation -Dir.framework.server.port=37699 -XX:+UnlockDiagnosticVMOptions -XX:+ReduceAllocationMerges -XX:+TraceReduceAllocationMerges -XX:+DeoptimizeALot -XX:CompileCommand=exclude,*::dummy* -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+LogCompilation -XX:CompilerDirectivesFile=test-vm-compile-commands-pid-13910.log -XX:CompilerDirectivesLimit=421 -XX:-OmitStackTraceInFastThrow -DShouldDoIRVerification=true -XX:-BackgroundCompilation -XX:CompileCommand=quiet compiler.lib.ir_framework.test.TestVM compiler.c2.irTests.scalarReplacement.AllocationMergesTests

      One or more @IR rules failed:

      Failed IR Rules (1) of Methods (1)
      ----------------------------------
      1) Method "int compiler.c2.irTests.scalarReplacement.AllocationMergesTests.testNoEscapeWithLoadInLoop_C2(boolean,int,int)" - [Failed IR rules: 1]:
         * @IR rule 1: "@compiler.lib.ir_framework.IR(applyIfCPUFeatureAnd={}, phase={DEFAULT}, applyIf={}, applyIfCPUFeatureOr={}, applyIfCPUFeature={}, counts={}, applyIfAnd={}, failOn={"_#ALLOC#_"}, applyIfOr={}, applyIfNot={})"
           > Phase "PrintOptoAssembly":
             - failOn: Graph contains forbidden nodes:
               * Constraint 1: "(.*precise .*\R((.*(?i:mov|mv|xorl|nop|spill).*|\s*)\R)*.*(?i:call,static).*wrapper for: _new_instance_Java)"
                 - Matched forbidden node:
                   * 18c movq RSI, precise compiler/c2/irTests/scalarReplacement/AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # ptr
                     196 movq RBP, [rsp + #0] # spill
                             nop # 1 bytes pad for loops and calls
                     19b call,static wrapper for: _new_instance_Java
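
To make the failing constraint easier to inspect outside the test run, the following minimal sketch applies the regular expression from "Constraint 1" to the matched lines quoted above. The class name `AllocRuleDemo`, the helper `containsAllocation`, and the two sample constants are hypothetical names introduced for illustration. This is not how the IR framework itself performs the check; it only shows that the pattern fires on the slow-path call to `_new_instance_Java` and not on ordinary loop instructions.

```java
import java.util.regex.Pattern;

public class AllocRuleDemo {
    // Constraint 1 of the failed @IR rule, copied from the report
    // (backslashes doubled for the Java string literal).
    static final Pattern ALLOC = Pattern.compile(
        "(.*precise .*\\R((.*(?i:mov|mv|xorl|nop|spill).*|\\s*)\\R)*"
        + ".*(?i:call,static).*wrapper for: _new_instance_Java)");

    // The four lines reported as "Matched forbidden node" (block B13).
    static final String SLOW_PATH = String.join("\n",
        "18c movq RSI, precise compiler/c2/irTests/scalarReplacement/"
            + "AllocationMergesTests$Point: 0x00007f19e81bddc0:Constant:exact * # ptr",
        "196 movq RBP, [rsp + #0] # spill",
        "        nop # 1 bytes pad for loops and calls",
        "19b call,static wrapper for: _new_instance_Java");

    // Two lines from the unrolled loop (block B8); no allocation here.
    static final String LOOP_BODY = String.join("\n",
        "0d3 leal RBX, [R9 + R11]",
        "0d7 addl R8, RBX # int");

    // Hypothetical helper: true if the text contains the forbidden pattern.
    static boolean containsAllocation(String optoAssembly) {
        return ALLOC.matcher(optoAssembly).find();
    }

    public static void main(String[] args) {
        System.out.println("slow path matches: " + containsAllocation(SLOW_PATH));
        System.out.println("loop body matches: " + containsAllocation(LOOP_BODY));
    }
}
```

The pattern requires a line containing `precise `, followed only by move, nop, or spill lines (or blank lines), and then the static call to the `_new_instance_Java` wrapper. Because that sequence is present in the compiled method, the allocation of `Point` was not scalar-replaced in this compilation.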

            cslucas Cesar Soares
            chagedorn Christian Hagedorn