Shenandoah: Mutator may block at _gc_waiters_lock after allocation failure even when the block parameter is false


    • gc

      While working on a PoC that uses WaitBarrier instead of a mutex/monitor to block mutators on allocation failure, I noticed that Shenandoah and Genshen handle allocation failure differently in terms of how mutators are blocked.

      ```
      void ShenandoahController::handle_alloc_failure(const ShenandoahAllocRequest& req, bool block) {
        assert(current()->is_Java_thread(), "expect Java thread here");

        const bool is_humongous = ShenandoahHeapRegion::requires_humongous(req.size());
        const GCCause::Cause cause = is_humongous ? GCCause::_shenandoah_humongous_allocation_failure : GCCause::_allocation_failure;

        ShenandoahHeap* const heap = ShenandoahHeap::heap();
        if (heap->cancel_gc(cause)) {
          log_info(gc)("Failed to allocate %s, " PROPERFMT, req.type_string(), PROPERFMTARGS(req.size() * HeapWordSize));
          request_gc(cause);
        }

        if (block) {
          MonitorLocker ml(&_alloc_failure_waiters_lock);
          while (!should_terminate() && ShenandoahCollectorPolicy::is_allocation_failure(heap->cancelled_cause())) {
            ml.wait();
          }
        }
      }
      ```

      In ShenandoahControlThread, the request_gc method notifies the control thread and then waits on _gc_waiters_lock until one GC cycle finishes, even when the cause is an allocation failure.

      ```
      void ShenandoahControlThread::request_gc(GCCause::Cause cause) {
        if (ShenandoahCollectorPolicy::should_handle_requested_gc(cause)) {
          handle_requested_gc(cause);
        }
      }
      ```

      ShenandoahGenerationalControlThread has a different implementation: if the cause is an allocation failure, it only notifies the control thread and does not wait for a GC cycle to finish.

      ```
      void ShenandoahGenerationalControlThread::request_gc(GCCause::Cause cause) {
        if (ShenandoahCollectorPolicy::is_allocation_failure(cause)) {
          // GC should already be cancelled. Here we are just notifying the control thread to
          // wake up and handle the cancellation request, so we don't need to set _requested_gc_cause.
          notify_cancellation(cause);
        } else if (ShenandoahCollectorPolicy::should_handle_requested_gc(cause)) {
          handle_requested_gc(cause);
        }
      }
      ```

      The Shenandoah implementation seems problematic:
      1. The first mutator thread calling ShenandoahController::handle_alloc_failure (the one for which `cancel_gc(cause)` returns true) is always blocked in `request_gc(cause)` waiting for one GC cycle to finish, regardless of whether the parameter `block` is true or false.
      2. Subsequent mutator threads are more likely to block at _alloc_failure_waiters_lock, and only if `block` is true.

      We should unify the behavior of ShenandoahController::handle_alloc_failure for Shenandoah and Genshen.
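
To make the difference concrete, here is a minimal toy model of the two control flows described above. It is not HotSpot code: the `Toy*` classes, `blocked_at`, and `first_mutator_nonblocking` are hypothetical names invented for this illustration, real blocking is replaced by recording the name of the lock the caller would wait on, and it assumes (as described in this issue) that `handle_requested_gc` waits on `_gc_waiters_lock`.

```cpp
#include <string>
#include <vector>

// Toy model only; none of these types exist in HotSpot.
enum class Cause { AllocationFailure, Explicit };

struct ToyController {
  bool cancelled = false;
  // Instead of really blocking, record the lock the calling mutator would wait on.
  std::vector<std::string> blocked_at;

  virtual ~ToyController() = default;
  virtual void request_gc(Cause cause) = 0;

  // Models heap->cancel_gc(cause): only the first caller succeeds.
  bool cancel_gc() {
    if (cancelled) return false;
    cancelled = true;
    return true;
  }

  // Mirrors the shape of ShenandoahController::handle_alloc_failure.
  void handle_alloc_failure(bool block) {
    if (cancel_gc()) {
      request_gc(Cause::AllocationFailure);
    }
    if (block) {
      blocked_at.push_back("_alloc_failure_waiters_lock");
    }
  }
};

// Non-generational path: request_gc goes through handle_requested_gc,
// which (per this issue) waits for one GC cycle on _gc_waiters_lock.
struct ToyControlThread : ToyController {
  void request_gc(Cause) override {
    blocked_at.push_back("_gc_waiters_lock");
  }
};

// Generational path: an allocation failure only notifies the control thread.
struct ToyGenerationalControlThread : ToyController {
  void request_gc(Cause cause) override {
    if (cause == Cause::AllocationFailure) {
      return;  // notify_cancellation(cause): no waiting
    }
    blocked_at.push_back("_gc_waiters_lock");
  }
};

// Locks the first mutator would wait on when it passes block == false.
template <typename Controller>
std::vector<std::string> first_mutator_nonblocking() {
  Controller c;
  c.handle_alloc_failure(/*block=*/false);
  return c.blocked_at;
}
```

In this model, `first_mutator_nonblocking<ToyControlThread>()` reports a wait on `_gc_waiters_lock` even though `block` is false, while `first_mutator_nonblocking<ToyGenerationalControlThread>()` reports no wait at all, which is the inconsistency this issue proposes to remove.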

            Assignee:
            Xiaolong Peng
            Reporter:
            Xiaolong Peng