JDK-7197906

BlockOffsetArray::power_to_cards_back() needs to handle > 32 bit shifts


    • gc
    • b02
    • generic
    • generic

        Hal Mo reported the following issue on the hotspot-gc-dev alias:

        http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2012-September/004978.html

        Hi all,

        This is Hal Mo <kungu.mjh at taobao.com> from Alibaba Group (with OCA).

        Our Hadoop NameNode crashed when we set the heap size to 135G with the
        CMS collector. The crash log (hs_err_pid.log) is attached.

        I can reliably reproduce the crash on a test machine with 190G of
        physical memory using a simple command:
        $ java -Xmx135g -XX:+UseConcMarkSweepGC

        I then built a debug JVM and used gdb to investigate the problem.

        Call stack:

        C [libc.so.6+0x7a9b0] memset+0x40
        V [libjvm.so+0x2b6c42] BlockOffsetArray::set_remainder_to_point_to_start_incl(unsigned long, unsigned long, bool)+0xce
        V [libjvm.so+0x2b7043] BlockOffsetArray::set_remainder_to_point_to_start(HeapWord*, HeapWord*, bool)+0x71
        V [libjvm.so+0x2b728d] BlockOffsetArray::BlockOffsetArray(BlockOffsetSharedArray*, MemRegion, bool)+0x9f
        V [libjvm.so+0x3c089f] BlockOffsetArrayNonContigSpace::BlockOffsetArrayNonContigSpace(BlockOffsetSharedArray*, MemRegion)+0x37
        V [libjvm.so+0x3be56f] CompactibleFreeListSpace::CompactibleFreeListSpace(BlockOffsetSharedArray*, MemRegion, bool, FreeBlockDictionary::DictionaryChoice)+0x9b
        V [libjvm.so+0x3fd2e1] ConcurrentMarkSweepGeneration::ConcurrentMarkSweepGeneration(ReservedSpace, unsigned long, int, CardTableRS*, bool, FreeBlockDictionary::DictionaryChoice)+0x1df
        V [libjvm.so+0x4dc03e] GenerationSpec::init(ReservedSpace, int, GenRemSet*)+0x37c
        V [libjvm.so+0x4ced40] GenCollectedHeap::initialize()+0x510
        V [libjvm.so+0x7c23c3] Universe::initialize_heap()+0x31d
        V [libjvm.so+0x7c27ec] universe_init()+0xa6
        V [libjvm.so+0x5056e2] init_globals()+0x34
        V [libjvm.so+0x7ac926] Threads::create_vm(JavaVMInitArgs*, bool*)+0x23a
        V [libjvm.so+0x53f3d4] JNI_CreateJavaVM+0x7a

        In BlockOffsetArray::set_remainder_to_point_to_start_incl, inside the
        for loop:
            size_t reach = start_card - 1 + (power_to_cards_back(i+1) - 1);
        When i = 7, the value of reach was 0, so the loop could not break, and
            _array->set_offset_array(start_card_for_region, reach, offset, reducing);
        accessed the wrong address and crashed.

        The root cause is:
        static size_t power_to_cards_back(uint i) {
            return (size_t)(1 << (LogBase * i));
        }
        The literal 1 is a 32-bit int, so the shift is evaluated in 32 bits
        before the cast to size_t is applied, and 1 << 32 overflows.
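The difference between the two cast placements can be illustrated with a small sketch. It is not HotSpot code. `LogBase = 4` is an assumption inferred from the report (a shift of 32 for an argument of 8), and the `_narrow` helper is a hypothetical, well-defined stand-in for the original expression: shifting a 32-bit int by 32 is undefined behavior in C++, so the sketch models it as "keep only the low 32 bits" rather than executing it.

```cpp
#include <cstdint>

// Assumed value: the report implies LogBase * 8 == 32.
constexpr unsigned LogBase = 4;

// Shape of the patched function: the operand is widened before the shift.
// uint64_t is used instead of size_t so the sketch behaves the same on any host.
constexpr std::uint64_t power_to_cards_back_fixed(unsigned i) {
    return static_cast<std::uint64_t>(1) << (LogBase * i);
}

// Hypothetical model of the original 32-bit arithmetic: compute the wide
// result, then truncate to 32 bits. The real expression is undefined for a
// shift count of 32, so this is only one possible outcome.
constexpr std::uint64_t power_to_cards_back_narrow(unsigned i) {
    return static_cast<std::uint32_t>(static_cast<std::uint64_t>(1) << (LogBase * i));
}

// Both forms agree while the result still fits in 32 bits...
static_assert(power_to_cards_back_fixed(7) == power_to_cards_back_narrow(7),
              "16^7 fits in 32 bits");
// ...and diverge at the argument the report identifies (i + 1 == 8).
static_assert(power_to_cards_back_fixed(8) == 4294967296ULL, "16^8 == 2^32");
static_assert(power_to_cards_back_narrow(8) == 0, "low 32 bits of 2^32 are zero");
```

The checks run at compile time, so a successful compile with `-std=c++11` or later is the whole test.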


        Here is my fix (tested), also in the attached file
        cms_large_heap_crash.patch:

        +++ b/src/share/vm/memory/blockOffsetTable.hpp
        @@ -289,7 +289,7 @@
           };

           static size_t power_to_cards_back(uint i) {
        -    return (size_t)(1 << (LogBase * i));
        +    return (size_t)1 << (LogBase * i);
           }
           static size_t power_to_words_back(uint i) {
             return power_to_cards_back(i) * N_words;

        Contributed-by: Hal Mo <kungu.mjh at taobao.com>

        A similar pattern also exists in G1, but the sizes there are in units
        of megabytes (2^20), and a value of 2^(32+20) is too large to be
        reached, so the overflow cannot occur in practice.
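As a rough cross-check of why the failure appears at this heap size, the sketch below estimates which power of 16 is needed to span a space of a given size. It rests on two assumptions not stated in the report: a card size of 512 bytes and `LogBase = 4`. The helper `first_covering_power` is hypothetical and simplifies the real loop condition, which also depends on `start_card` and `end_card`.

```cpp
#include <cstdint>

constexpr unsigned LogBase = 4;            // assumed, as implied by the report
constexpr std::uint64_t CardBytes = 512;   // assumed card size
constexpr std::uint64_t GiB = 1ULL << 30;

// Smallest exponent p such that 16^p cards are enough to span `cards`.
// Hypothetical helper; needs C++14 for the loop inside a constexpr function.
constexpr unsigned first_covering_power(std::uint64_t cards) {
    unsigned p = 0;
    while ((static_cast<std::uint64_t>(1) << (LogBase * p)) < cards) {
        ++p;
    }
    return p;
}

// A 128 GiB space is exactly 2^28 cards, so 16^7 still covers it.
static_assert(first_covering_power(128 * GiB / CardBytes) == 7, "128 GiB needs 16^7");
// A 135 GiB space needs 16^8 == 2^32, the value a 32-bit shift cannot produce.
static_assert(first_covering_power(135 * GiB / CardBytes) == 8, "135 GiB needs 16^8");
```

Under these assumptions the boundary sits near 128 GiB for the space being covered, which is consistent with the crash appearing at `-Xmx135g`.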

              brutisso Bengt Rutisson (Inactive)