Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-6880034

SIGBUS during deoptimisation at a safepoint on 64bit-SPARC

XMLWordPrintable

    • Icon: Bug Bug
    • Resolution: Fixed
    • Icon: P4 P4
    • hs17
    • hs15, hs17
    • hotspot
    • b04
    • sparc
    • solaris
    • Verified

        Volker Simonis wrote:
        I discovered a problem during deoptimisation at a safepoint which
        leads to a SIGBUS on 64bit-SPARC. The problem was introduced by the
        change "6420645: Create a vm that uses compressed oops for up to 32gb
        heapsizes" which has been submitted by you. The problem is easily
        reproducible with the attached test program. Just run:

        java -d64 -server -showversion -Xcomp -Xbatch
        "-XX:CompileCommand=compileonly DeoptTest
        deopt_compiledframe_at_safepoint" -XX:+PrintCompilation DeoptTest

        and you will get a VM crash like:

        CompilerOracle: compileonly DeoptTest.deopt_compiledframe_at_safepoint
        java version "1.6.0_14"
        Java(TM) SE Runtime Environment (build 1.6.0_14-b08)
        Java HotSpot(TM) 64-Bit Server VM (build 14.0-b16, compiled mode)

          1 b DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
          1 made not entrant DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
        #
        # A fatal error has been detected by the Java Runtime Environment:
        #
        # SIGBUS (0xa) at pc=0xffffffff7e37514c, pid=9314, tid=15
        #
        # JRE version: 6.0_14-b08
        # Java VM: Java HotSpot(TM) 64-Bit Server VM (14.0-b16 compiled mode
        solaris-sparc )
        # Problematic frame:
        # V [libjvm.so+0x77514c]

        As noticed before, the error is a regression of change 6420645. It
        doesn't happen with earlier versions of the HotSpot. For example 6u13
        with HS 11 runs the test just fine:

        CompilerOracle: compileonly DeoptTest.deopt_compiledframe_at_safepoint
        java version "1.6.0_13"
        Java(TM) SE Runtime Environment (build 1.6.0_13-b03)
        Java HotSpot(TM) 64-Bit Server VM (build 11.3-b02, compiled mode)

          1 b DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
          1 made not entrant DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
        OK


        Notice that the problem is still present in the HS head revsion. I've
        tried with 7-ea b70:

        java version "1.7.0-ea"
        Java(TM) SE Runtime Environment (build 1.7.0-ea-b70)
        Java HotSpot(TM) 64-Bit Server VM (build 16.0-b07, compiled mode)

          1 b DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
          1 made not entrant DeoptTest::deopt_compiledframe_at_safepoint (220 bytes)
        #
        # A fatal error has been detected by the Java Runtime Environment:
        #
        # SIGBUS (0xa) at pc=0xffffffff7e38c7fc, pid=10714, tid=15
        #
        # JRE version: 7.0-b70
        # Java VM: Java HotSpot(TM) 64-Bit Server VM (16.0-b07 compiled mode
        solaris-sparc )
        # Problematic frame:
        # V [libjvm.so+0x78c7fc]


        The problem is caused be the following changes in frame.cpp:


        --- a/src/share/vm/runtime/frame.cpp Sat Dec 01 00:00:00 2007 +0000
        +++ b/src/share/vm/runtime/frame.cpp Sun Apr 13 17:43:42 2008 -0400
        @@ -1153,9 +1153,8 @@ oop* frame::oopmapreg_to_location(VMReg
             // If it is passed in a register, it got spilled in the stub frame.
             return (oop *)reg_map->location(reg);
           } else {
        - int sp_offset_in_stack_slots = reg->reg2stack();
        - int sp_offset = sp_offset_in_stack_slots >> (LogBytesPerWord -
        LogBytesPerInt);
        - return (oop *)&unextended_sp()[sp_offset];
        + int sp_offset_in_bytes = reg->reg2stack() * VMRegImpl::stack_slot_size;
        + return (oop*)(((address)unextended_sp()) + sp_offset_in_bytes);
           }
         }

        and the fact that reg->reg2stack() returns odd values for float
        registers >= F32. This finally leads to a BUS error due to an
        unaligned double read when the location of the register is accessed
        through the reg_map during deoptimisation in
        StackValue::create_stack_value(). In the old implementation, this was
        hidden by the right shift in frame::oopmapreg_to_location() which
        mapped F32 and F33 to the same stack offset.

        The problem can be easily solved by switching back to the old
        implementation of frame::oopmapreg_to_location(), but I assume there
        was a rational behind the change and that the new implementation is
        probably necessary for compressed oops (at least that's what the whole
        change was all about). So I dug a little further and found that in my
        opinion the root cause of the whole problem is the strange numbering
        of the 16 upper double registers in sparc.ad. They are defined as
        follows:

        reg_def R_D32x(SOC, SOC, Op_RegD,255, F32->as_VMReg());
        reg_def R_D32 (SOC, SOC, Op_RegD, 1, F32->as_VMReg()->next());
        reg_def R_D34x(SOC, SOC, Op_RegD,255, F34->as_VMReg());
        reg_def R_D34 (SOC, SOC, Op_RegD, 3, F34->as_VMReg()->next());
        ...
        reg_def R_D62x(SOC, SOC, Op_RegD,255, F62->as_VMReg());
        reg_def R_D62 (SOC, SOC, Op_RegD, 31, F62->as_VMReg()->next());

        This maps the invalid half (R_D32x, R_D34x, ..) of the double
        registers F32-F62 to even VMReg numbers (96, 98, ..) and the valid
        part (R_D32, R_D34, ..) to odd VMReg numbers (97, 99, ..). Later on,
        when the locals array for the safepoint is constructed in
        Compile::FillLocArray(), the call to OptoReg::as_VMReg(regnum) for a
        valid, even double register >= F32 (e.g. 96) returns the invalid, odd
        part (e.g. 97). This odd VMReg number is than stored in the Location
        part of the local and leads to the undesired behaviour in the new
        implementation of frame::oopmapreg_to_location() as described before.

        I don't know why this strange encoding has been chosen for the 16
        upper double registers in sparc.ad, but changing it to:

        reg_def R_D32x(SOC, SOC, Op_RegD,255, F32->as_VMReg()->next());
        reg_def R_D32 (SOC, SOC, Op_RegD, 1, F32->as_VMReg());
        reg_def R_D34x(SOC, SOC, Op_RegD,255, F34->as_VMReg()->next());
        reg_def R_D34 (SOC, SOC, Op_RegD, 3, F34->as_VMReg());
        ...
        reg_def R_D62x(SOC, SOC, Op_RegD,255, F62->as_VMReg()->next());
        reg_def R_D62 (SOC, SOC, Op_RegD, 31, F62->as_VMReg());

        which seems more natural to me, solved the SIGBUS issue and didn't
        revealed any other problems in the tests which I run so far.

        Could you please comment on the proposed solution of changing the
        VMReg numbering of F32-F62 or advice a better solution if you think
        that the proposed one will not work in the general case?

        Thank you and best regards,
        Volker

        PS: while I was hunting the error, I also stumbled across the
        following code in RegisterSaver::save_live_registers():

          // Save all the FP registers
          int offset = d00_offset;
          for( int i=0; i<64; i+=2 ) {
            FloatRegister f = as_FloatRegister(i);
            __ stf(FloatRegisterImpl::D, f, SP, offset+STACK_BIAS);
            map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), f->as_VMReg());
            if (true) {
              map->set_callee_saved(VMRegImpl::stack2reg((offset +
        sizeof(float))>>2), f->as_VMReg()->next());
            }
            offset += sizeof(double);
          }

        In my opinion, this could be changed to:

          // Save all the FP registers
          int offset = d00_offset;
          for( int i=0; i<64; i+=2 ) {
            FloatRegister f = as_FloatRegister(i);
            __ stf(FloatRegisterImpl::D, f, SP, offset+STACK_BIAS);
            map->set_callee_saved(VMRegImpl::stack2reg(offset>>2), f->as_VMReg());
            if (i < 32) { // VS 2009-08-31: the 16 upper double registers
        can't be used as floats anyway
              map->set_callee_saved(VMRegImpl::stack2reg((offset +
        sizeof(float))>>2), f->as_VMReg()->next());
            }
            offset += sizeof(double);
          }

        because the 16 upper double registers can't be used as floats anyway.
        Again, I didn't found any regression in my few tests. What do you
        think?

              kvn Vladimir Kozlov
              kvn Vladimir Kozlov
              Votes:
              0 Vote for this issue
              Watchers:
              0 Start watching this issue

                Created:
                Updated:
                Resolved:
                Imported:
                Indexed: