JDK-4228209

race condition in green threads scheduler in 1.1.7?

    • Type: Bug
    • Resolution: Won't Fix
    • Priority: P4
    • None
    • 1.1.7
    • hotspot
    • sparc
    • solaris_2.5.1



      Name: clC74495 Date: 04/09/99


      I believe a customer application has revealed a bug in the green threads
      implementation in 1.1.7, and I think in several earlier versions as well.
      I'm running HP's VM with green threads on HP-UX 10.20, but I've read the
      Solaris code carefully and I think the bug will happen there as well. There
      appears to be a race condition between processPendingNotification() in
      green_threads/src/signals.c, when it is called from fullSwitchContext(), and
      the queueing of pending notifications in asyncEventNotify() when a signal
      arrives.

      Here's the code for processPendingNotification(), fullSwitchContext() and
      asyncEventNotify(), the three functions involved. I don't think the call to
      intrLock() should have been commented out in fullSwitchContext(). If
      interrupts are enabled when processPendingNotification() is called, a signal
      can arrive just as this function is about to set
      PendingNotifyQ = SYS_MID_NULL. Since the SCHED_LOCK is held at that point,
      asyncEventNotify() puts the monitor on the pending notify queue, and it can
      do so just before the queue is set to NULL. That monitor's
      SYS_MON_PENDING_NOTIFICATION flag is then set as if it were on the queue,
      but it really isn't. The next time that signal arrives while the SCHED_LOCK
      is held, asyncEventNotify() sees the flag, does not queue the monitor
      again, and the notification for that signal is never handled.

      The application that shows this problem often runs for two days or more
      before this exact situation occurs, but when it does, the application
      hangs because the clock thread never gets notification when a SIGALRM
      happens.
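
The interleaving described above can be replayed deterministically in a small stand-alone program. This is only a sketch under stated assumptions: mon_t, PENDING, pending_head, late_signal and the function names are hypothetical stand-ins for sys_mon_t, SYS_MON_PENDING_NOTIFICATION, PendingNotifyQ and the real routines, and the "signal" is simulated by a direct call placed in the window just before the queue head is reset. It does not use the VM's actual headers.

```c
#include <assert.h>
#include <stddef.h>

#define PENDING 0x1           /* stand-in for SYS_MON_PENDING_NOTIFICATION */

typedef struct mon {
    struct mon *pendingq;     /* next monitor on the pending queue */
    int flags;
    int broadcasts;           /* how many times waiters were notified */
} mon_t;

static mon_t *pending_head;   /* stand-in for PendingNotifyQ */
static mon_t *late_signal;    /* monitor whose simulated signal fires in the window */

/* Queueing branch of asyncEventNotify(), taken while the scheduler lock is held. */
static void async_notify_locked(mon_t *m)
{
    assert(m != NULL);
    if ((m->flags & PENDING) == 0) {
        m->pendingq = pending_head;
        pending_head = m;
    }
    m->flags |= PENDING;
}

/* Same shape as processPendingNotification(), with interrupts left enabled. */
static void drain_unprotected(void)
{
    mon_t *q = pending_head;

    while (q) {
        mon_t *m = q;
        q = q->pendingq;
        m->pendingq = NULL;
        m->flags &= ~PENDING;
        m->broadcasts++;      /* stands in for interruptBroadcast() */
    }
    if (late_signal) {        /* simulated signal delivery inside the race window */
        async_notify_locked(late_signal);
        late_signal = NULL;
    }
    pending_head = NULL;      /* silently drops the monitor queued a moment ago */
}

/* Returns 1 if the monitor ends up flagged as queued while the queue is empty. */
static int demo_lost_notification(void)
{
    mon_t clock_mon = { NULL, 0, 0 };
    int stale, still_unqueued;

    pending_head = NULL;
    async_notify_locked(&clock_mon);   /* first signal: queued normally */
    late_signal = &clock_mon;          /* second signal lands in the window */
    drain_unprotected();
    stale = (clock_mon.flags & PENDING) != 0 && pending_head == NULL;

    async_notify_locked(&clock_mon);   /* third signal: flag already set, not queued */
    still_unqueued = (pending_head == NULL);

    return stale && still_unqueued && clock_mon.broadcasts == 1;
}
```

Calling demo_lost_notification() from a main() replays the sequence: one broadcast is delivered, after which the flag stays set and every later notification taken with the scheduler lock held is skipped.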

      static int
      processPendingNotification()
      {
          sys_mon_t *mid;
          sys_mon_t *midq;
          int need_to_reschedule = 0;

          /*
           * Perform the notifyall's that couldn't be done because of locking.
           * Pull all of the pending CV notifies off of the queue, and process
           * them. The notification is completely normal, except for the locking.
           * If a yield is necessary, it is done by the caller.
           */
          sysAssert(PendingNotifyQ);
          midq = PendingNotifyQ;
          while (midq) {
              mid = midq;
              midq = midq->pendingq;
              mid->pendingq = SYS_MID_NULL;
              mid->flags &= ~SYS_MON_PENDING_NOTIFICATION;

              /* Now notify all of the waiters */
              need_to_reschedule |= interruptBroadcast(mid);
          }
          PendingNotifyQ = SYS_MID_NULL;
          return need_to_reschedule;
      }

      void
      fullSwitchContext(context_t *c)
      {
          /*
           * This thread was interrupted, which means that it was NOT inside
           * a critical section. This also means that the thread does not know
           * that it needs to clear the critical section lock when control
           * returns to it. We have to clear the lock for it, and there may
           * be pending interrupts.
           */

          /*
           * If the scheduler was interrupted, we process the events, and call
           * it with interrupts disabled. This prevents recursion.
           */
          /* intrLock(); - removed by csw */
          if (PendingNotifyQ) {
              if (processPendingNotification()) {
                  sys_thread_t *self = greenThreadSelf();

                  if (self->state == RUNNABLE) {
                      queueInsert(&runnable_queue, self);
                  }
                  reschedule();
              }
          }
          /* intrUnlock(); - removed by csw */

          /*
           * Don't have to call return_from_trap instead of switch_context
           * if preemption is set because we are continuing the thread that
           * was interrupted when the signal came in and the signal trampoline
           * will restore the original signal mask.
           */
          greenThreadSelf()->full_switch = 0;
          switchContext(c);
          /* NOTREACHED */
      }


      static int
      asyncEventNotify(sys_mon_t *mid)
      {
          int need_to_switch = 0;

          sysAssert(mid!=NULL);
          if (SCHED_LOCKED()) {
              /*
               * Queue this notification
               */
              if ((mid->flags & SYS_MON_PENDING_NOTIFICATION) == 0) {
                  sysAssert(mid->pendingq == SYS_MID_NULL);
                  mid->pendingq = PendingNotifyQ;
                  PendingNotifyQ = mid;
              }
              mid->flags |= SYS_MON_PENDING_NOTIFICATION;
          } else {
              /* Actually handle the notification */
              need_to_switch = interruptBroadcast(mid);
          }
          return need_to_switch;
      }
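
For comparison, here is a minimal sketch of a drain that cannot be interrupted by a signal handler. It assumes that the commented-out intrLock()/intrUnlock() pair amounts to blocking and restoring signals; that is an assumption about the green threads code, not something shown in the listing, so plain POSIX sigprocmask() is swapped in here. mon_t, PENDING, pending_head and the function names are hypothetical stand-ins. The sketch also detaches the list head before walking it, so a monitor queued later would land on a fresh queue instead of being dropped.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

#define PENDING 0x1           /* stand-in for SYS_MON_PENDING_NOTIFICATION */

typedef struct mon {
    struct mon *pendingq;     /* next monitor on the pending queue */
    int flags;
    int broadcasts;           /* how many times waiters were notified */
} mon_t;

static mon_t *pending_head;   /* stand-in for PendingNotifyQ */

/* Drain the queue with all signals blocked; returns the number of monitors processed. */
static int drain_with_signals_blocked(void)
{
    sigset_t all, saved;
    mon_t *q;
    int drained = 0;
    int rc;

    sigfillset(&all);
    rc = sigprocmask(SIG_BLOCK, &all, &saved);    /* assumed equivalent of intrLock() */
    assert(rc == 0);

    q = pending_head;
    pending_head = NULL;      /* detach the list before walking it */
    while (q) {
        mon_t *m = q;
        q = q->pendingq;
        m->pendingq = NULL;
        m->flags &= ~PENDING;
        m->broadcasts++;      /* stands in for interruptBroadcast() */
        drained++;
    }

    rc = sigprocmask(SIG_SETMASK, &saved, NULL);  /* assumed equivalent of intrUnlock() */
    assert(rc == 0);
    (void)rc;
    return drained;
}

/* Returns 1 if two queued monitors are both drained and the queue is left empty. */
static int demo_protected_drain(void)
{
    mon_t a = { NULL, PENDING, 0 };
    mon_t b = { &a, PENDING, 0 };
    int n;

    pending_head = &b;
    n = drain_with_signals_blocked();

    return n == 2 && pending_head == NULL &&
           a.flags == 0 && b.flags == 0 &&
           a.broadcasts == 1 && b.broadcasts == 1;
}
```

Whether masking signals at this point is acceptable in fullSwitchContext() depends on why the original calls were removed, which the listing does not say.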
      (Review ID: 56736)
      ======================================================================

            Assignee: Unassigned
            Reporter: Carlos Lucasius (clucasius, Inactive)