JDK-8202772

NMT thread stack tracking causes crashes on AIX

Details

    • Bug
    • Resolution: Fixed
    • P4
    • 11
    • 11
    • hotspot
    • None
    • b18
    • aix

      Description

        On AIX, we see:

        #
        # A fatal error has been detected by the Java Runtime Environment:
        #
        # Internal Error (/priv/d031900/openjdk/jdk-jdk/source/src/hotspot/share/services/virtualMemoryTracker.cpp:516), pid=24641784, tid=5141
        # assert(committed_size > 0 && is_aligned(committed_size, os::vm_page_size())) failed: Must be
        #
        # JRE version: OpenJDK Runtime Environment (11.0) (fastdebug build 11-internal+0-adhoc.d031900.source)
        # Java VM: OpenJDK 64-Bit Server VM (fastdebug 11-internal+0-adhoc.d031900.source, mixed mode, tiered, compressed oops, g1 gc, aix-ppc64)
        # No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
        #
        # If you would like to submit a bug report, please visit:
        # http://bugreport.java.com/bugreport/crash.jsp
        #

        Stack: [0x0000000117010000,0x000000011721d888], sp=0x000000011721bcc0, free space=2095k
        No context given, using current context.
        ------ current frame:
        iar: 0x0900000144fb9f18 libjvm.so::AixNativeCallstack::print_callstack_for_context(outputStream*,const ucontext_t*,bool,char*,unsigned long)+0x918 (C++ saves_cr saves_lr stores_bc gpr_saved:16 fixedparms:5 )
        lr: 0x0000000000000000 (unknown module)::(unknown function)+?
        sp: 0x000000011721a290 (base - 0x35F8)
        rtoc: 0x09001000a059c518
        |---stackaddr----| |----lrsave------|: <function name>
        0x000000011721a9c0 - 0x0900000141c0d980 libjvm.so::os::platform_print_native_stack(outputStream*,void*,char*,int)+0x20 (C++ saves_lr stores_bc fixedparms:4 )
        0x000000011721aa30 - 0x090000014234a6c4 libjvm.so::VMError::report(outputStream*,bool)+0x1a44 (C++ saves_lr stores_bc gpr_saved:12 fixedparms:2 )
        0x000000011721bad0 - 0x090000014234d304 libjvm.so::VMError::report_and_die(int,const char*,const char*,char*,Thread*,unsigned char*,void*,void*,const char*,int,unsigned long)+0x1e4 (C++ saves_cr saves_lr stores_bc gpr_saved:16 fixedparms:8 )
        0x000000011721bcd0 - 0x090000014234f168 libjvm.so::VMError::report_and_die(Thread*,void*,const char*,int,const char*,const char*,char*)+0x48 (C++ saves_lr stores_bc fixedparms:7 )
        0x000000011721bd60 - 0x0900000141a0f4f0 libjvm.so::report_vm_error(const char*,int,const char*,const char*,...)+0xf0 (C++ saves_lr stores_bc gpr_saved:4 fixedparms:8 parmsonstk:1)
        0x000000011721bdf0 - 0x0900000141e05400 libjvm.so::RegionIterator::next_committed(unsigned char*&,unsigned long&)+0x100 (C++ saves_lr stores_bc gpr_saved:7 fixedparms:3 )
        0x000000011721bea0 - 0x0900000141e01504 libjvm.so::SnapshotThreadStackWalker::do_allocation_site(const ReservedMemoryRegion*)+0x104 (C++ saves_lr stores_bc gpr_saved:4 fixedparms:2 )
        0x000000011721bfa0 - 0x0900000141dfb1d8 libjvm.so::VirtualMemorySummary::snapshot(VirtualMemorySnapshot*)+0x98 (C++ saves_lr stores_bc gpr_saved:3 fixedparms:1 )
        0x000000011721c040 - 0x0900000141e0f1d8 libjvm.so::MemBaseline::baseline(bool)+0xa78 (C++ saves_lr stores_bc gpr_saved:11 fixedparms:2 )
        0x000000011721c180 - 0x0900000144ea67a4 libjvm.so::NMTDCmd::execute(DCmdSource,Thread*)+0xbc4 (C++ saves_lr stores_bc gpr_saved:11 fixedparms:3 )
        0x000000011721d120 - 0x0900000142372144 libjvm.so::DCmd::parse_and_execute(DCmdSource,outputStream*,const char*,char,Thread*)+0xae4 (C++ saves_cr saves_lr stores_bc gpr_saved:18 fixedparms:5 )

        ----

        Reproduce with:

        java -XX:NativeMemoryTracking=summary -XX:+PrintNMTStatistics

        or, alternatively, with:

        gtestLauncher -jdk:./images/jdk/ --gtest_filter=CommittedVirtualMemoryTracker.test_committed_virtualmemory_region_test_vm

        Both cases hit the assertion on AIX.

        --------------

        The problem is that NMT assumes stack boundaries to be page aligned. This is the case on most OSes, but it does not have to be, and on AIX it is not. POSIX certainly does not require pthread stack boundaries to be page aligned.
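
To make the failing condition concrete, here is a minimal sketch that evaluates the same predicate as the assert, using the stack range printed in the crash log above. It is not the actual NMT code path: `is_aligned_sketch` and `committed_size_ok` are hypothetical stand-ins, and the 4 KiB page size is an assumption (AIX can also use other page sizes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for HotSpot's is_aligned(); alignment must be a power of two.
inline bool is_aligned_sketch(std::uint64_t value, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value & (alignment - 1)) == 0;
}

// Same shape as the condition in the failing assert:
//   committed_size > 0 && is_aligned(committed_size, os::vm_page_size())
inline bool committed_size_ok(std::uint64_t committed_size, std::uint64_t page_size) {
  return committed_size > 0 && is_aligned_sketch(committed_size, page_size);
}

// Size of a stack described by its low and high address.
inline std::uint64_t stack_size(std::uint64_t low, std::uint64_t high) {
  assert(high > low);
  return high - low;
}
```

With the logged range `[0x0000000117010000,0x000000011721d888]` the size is `0x20d888`, which is not a multiple of 4096, so any size derived from the unaligned upper boundary can fail the check.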

        On AIX, stack boundaries are not aligned to the page size. For the stack end, this does not matter: when retrieving the stack dimensions from the OS, we just align the stack end up to the next page boundary, where we then place the thread stack guard pages. That is fine: the fact that the real pthread stack is actually a bit larger does not matter much.
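
The "align the stack end up" step can be sketched as follows. This is an illustration only: `align_up_sketch` is a hypothetical helper modelled on the usual power-of-two rounding idiom, and the addresses in the usage note are made up.

```cpp
#include <cassert>
#include <cstdint>

// Round value up to the next multiple of alignment (a power of two).
// An already aligned value is returned unchanged.
inline std::uint64_t align_up_sketch(std::uint64_t value, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::uint64_t mask = alignment - 1;
  return (value + mask) & ~mask;
}

// Number of bytes at the low end of the real pthread stack that are
// ignored when the stack end is rounded up to a page boundary.
inline std::uint64_t bytes_skipped_at_end(std::uint64_t stack_end, std::uint64_t page_size) {
  return align_up_sketch(stack_end, page_size) - stack_end;
}
```

For a hypothetical unaligned stack end of `0x117010123` and a 4 KiB page, the usable end becomes `0x117011000`, and the `0xedd` bytes below it are simply left unused.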

        For the stack base, however, the matter is different. The reported stack base is also not page aligned, and we cannot simply act as if it were.
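
One way to see why the base cannot just be treated as aligned is to look at what each rounding direction would do. The sketch below is illustrative and does not show the fix that was applied: `round_stack_base` is a hypothetical helper, the 4 KiB page size is an assumption, and the base address is the upper boundary from the crash log above.

```cpp
#include <cassert>
#include <cstdint>

struct BaseRounding {
  std::uint64_t rounded_down;   // base rounded down to a page boundary
  std::uint64_t rounded_up;     // base rounded up to a page boundary
  std::uint64_t bytes_dropped;  // real stack bytes excluded when rounding down
  std::uint64_t bytes_added;    // bytes beyond the stack included when rounding up
};

// Compute both page-aligned candidates for an unaligned stack base
// and how far each one deviates from the real boundary.
inline BaseRounding round_stack_base(std::uint64_t base, std::uint64_t page_size) {
  assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
  const std::uint64_t mask = page_size - 1;
  BaseRounding r;
  r.rounded_down = base & ~mask;
  r.rounded_up = (base + mask) & ~mask;
  r.bytes_dropped = base - r.rounded_down;
  r.bytes_added = r.rounded_up - base;
  return r;
}
```

For the base `0x11721d888`, rounding down would leave `0x888` bytes of the live stack out of the tracked range, while rounding up would cover `0x778` bytes that do not belong to the stack at all.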

              People

                stuefe Thomas Stuefe
                stuefe Thomas Stuefe