JDK-8211926

Catastrophic size_t underflow in BitMap::*_large methods

    • gc
    • b20

        The easiest way to demonstrate it now is to make GCs go via "large" bitmap methods for clearing:

        diff -r f697ba5b18d2 src/hotspot/share/gc/shared/markBitMap.cpp
        --- a/src/hotspot/share/gc/shared/markBitMap.cpp Mon Oct 08 13:25:39 2018 +0800
        +++ b/src/hotspot/share/gc/shared/markBitMap.cpp Tue Oct 09 14:08:15 2018 +0200
        @@ -51,6 +51,6 @@
                  p2i(mr.start()), p2i(mr.end()));
           // convert address range into offset range
        -  _bm.at_put_range(addr_to_offset(intersection.start()),
        -                   addr_to_offset(intersection.end()), false);
        +  _bm.clear_large_range(addr_to_offset(intersection.start()),
        +                        addr_to_offset(intersection.end()));
         }
         
        diff -r f697ba5b18d2 src/hotspot/share/utilities/bitMap.inline.hpp
        --- a/src/hotspot/share/utilities/bitMap.inline.hpp Mon Oct 08 13:25:39 2018 +0800
        +++ b/src/hotspot/share/utilities/bitMap.inline.hpp Tue Oct 09 14:08:15 2018 +0200
        @@ -305,8 +305,10 @@
         
         inline void BitMap::set_large_range_of_words(idx_t beg, idx_t end) {
        +  assert(beg <= end, "underflow");
           memset(_map + beg, ~(unsigned char)0, (end - beg) * sizeof(bm_word_t));
         }
         
         inline void BitMap::clear_large_range_of_words(idx_t beg, idx_t end) {
        +  assert(beg <= end, "underflow");
           memset(_map + beg, 0, (end - beg) * sizeof(bm_word_t));
         }


        Then tier1_gc would fail with lots of failures like this:

        # Internal Error (/home/shade/trunks/jdk-jdk/src/hotspot/share/utilities/bitMap.inline.hpp:312), pid=6727, tid=6753
        # assert(beg <= end) failed: underflow
        #
        V [libjvm.so+0x18e0e1f] VMError::report_and_die(Thread*, void*, char const*, int, char const*, char const*, __va_list_tag*)+0x2f
        V [libjvm.so+0xaf60fa] report_vm_error(char const*, int, char const*, char const*, ...)+0x12a
        V [libjvm.so+0x663e70] BitMap::clear_large_range(unsigned long, unsigned long)+0x160
        V [libjvm.so+0x134d8e2] MarkBitMap::clear_range(MemRegion)+0x72
        V [libjvm.so+0xc96ee6] G1ConcurrentMark::clear_range_in_prev_bitmap(MemRegion)+0x36
        V [libjvm.so+0xcaf2de] RemoveSelfForwardPtrObjClosure::zap_dead_objects(HeapWord*, HeapWord*)+0x8e
        V [libjvm.so+0xcaf9e3] RemoveSelfForwardPtrObjClosure::do_object(oop)+0x163
        V [libjvm.so+0xddf038] G1ContiguousSpace::object_iterate(ObjectClosure*)+0x48
        V [libjvm.so+0xcaf7f0] RemoveSelfForwardPtrHRClosure::do_heap_region(HeapRegion*)+0x1f0
        V [libjvm.so+0xc91a6b] G1CollectionSet::iterate_from(HeapRegionClosure*, unsigned int, unsigned int) const+0x5b
        V [libjvm.so+0xcae1af] G1ParRemoveSelfForwardPtrsTask::work(unsigned int)+0xaf
        V [libjvm.so+0x19604d8] GangWorker::loop()+0xe8
        V [libjvm.so+0x14d7130] thread_native_entry(Thread*)+0x100

        The underlying reason is that "large" methods have this code:

          idx_t beg_full_word = word_index_round_up(beg);
          idx_t end_full_word = word_index(end);

          assert(end_full_word - beg_full_word >= 32,
                 "the range must include at least 32 bytes");
          ...
          set_large_range_of_words(beg_full_word, end_full_word);

        ...or this code:

          idx_t beg_full_word = word_index_round_up(beg);
          idx_t end_full_word = word_index(end);

          if (end_full_word - beg_full_word < 32) {
            clear_range(beg, end);
            return;
          }
          ...
          clear_large_range_of_words(beg_full_word, end_full_word);

        This looks innocuous until you notice that idx_t is an alias for the unsigned size_t, so neither the assert nor the check protects us from beg_full_word > end_full_word: the subtraction underflows to a large positive value. This eventually happens when beg and end are close together, because beg is rounded up while end is rounded down. The underflowed value then reaches {clear|set}_large_range_of_words, which passes it to memset; memset runs through memory, wreaking havoc, and finally terminates with a SEGV. Unless, that is, we add an assert that prevents it, as in the patch above.
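
        The arithmetic can be reproduced outside the VM. The following is a minimal sketch, not the HotSpot code: it assumes 64-bit bitmap words, and word_index, word_index_round_up, full_word_count and memset_byte_count are simplified stand-ins written for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Simplified stand-ins for the BitMap types (assumption: 64-bit words).
typedef size_t    idx_t;
typedef uintptr_t bm_word_t;

static const idx_t LogBitsPerWord = 6;
static const idx_t BitsPerWord    = idx_t(1) << LogBitsPerWord;

// Word that contains the given bit (rounds down).
static idx_t word_index(idx_t bit) {
  return bit >> LogBitsPerWord;
}

// First word that starts at or after the given bit (rounds up).
static idx_t word_index_round_up(idx_t bit) {
  return (bit + BitsPerWord - 1) >> LogBitsPerWord;
}

// The subtraction the "large" methods perform. The bit range itself is
// valid (beg <= end), yet the word range may still be inverted.
static idx_t full_word_count(idx_t beg, idx_t end) {
  assert(beg <= end);
  idx_t beg_full_word = word_index_round_up(beg);
  idx_t end_full_word = word_index(end);
  return end_full_word - beg_full_word;  // wraps if beg_full_word > end_full_word
}

// Byte count that would be handed to memset.
static size_t memset_byte_count(idx_t beg, idx_t end) {
  return full_word_count(beg, end) * sizeof(bm_word_t);
}
```

        With these stand-ins, full_word_count(1, 2) rounds beg up to word 1 and end down to word 0, so the unsigned subtraction wraps to the maximum size_t value. That value passes a ">= 32" assert and fails a "< 32" check, which is why neither guard stops it.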

        This has not blown up in most testing so far, because the "large" methods are not used frequently, or they are used when the memory region for "large" clears/sets is big enough to avoid the underflow.
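
        One way to make the size check robust for short ranges is to compare with an addition instead of a subtraction. This is only a sketch under assumptions and is not presented as the actual fix: is_small_range_of_words and takes_large_path are hypothetical names, the word helpers are simplified stand-ins assuming 64-bit words, and the threshold of 32 is taken from the code quoted above.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins (assumption: 64-bit bitmap words).
typedef size_t idx_t;

static const idx_t LogBitsPerWord    = 6;
static const idx_t BitsPerWord       = idx_t(1) << LogBitsPerWord;
static const idx_t small_range_words = 32;  // threshold from the quoted code

static idx_t word_index(idx_t bit) {
  return bit >> LogBitsPerWord;
}

static idx_t word_index_round_up(idx_t bit) {
  return (bit + BitsPerWord - 1) >> LogBitsPerWord;
}

// Hypothetical helper: same meaning as "end - beg < 32", but written so
// that an inverted word range (beg_full_word > end_full_word) is
// classified as small instead of wrapping around. The addition is
// assumed not to overflow, since word indices are far below SIZE_MAX.
static bool is_small_range_of_words(idx_t beg_full_word, idx_t end_full_word) {
  return beg_full_word + small_range_words > end_full_word;
}

// Hypothetical helper: true if the range would go to the memset path.
static bool takes_large_path(idx_t beg, idx_t end) {
  assert(beg <= end);
  return !is_small_range_of_words(word_index_round_up(beg), word_index(end));
}
```

        Under these assumptions, a short range such as bits 1 to 2 is routed to the small path, while a range covering 32 or more full words still reaches the large path.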

              shade Aleksey Shipilev