One of our tests is crashing inside the pthread library when calling pthread_mutex_unlock. The crash log in ~/Library/Logs/DiagnosticReports/ shows the message:

libpthread/libpthread-301.50.1/src/pthread_mutex.c:_pthread_mutex_unlock_drop:939: __psynch_mutexdrop failed with error 22

The system log shows the following error message several times when the test fails:

PSYNCH: pid[727]: address already known to kernel for another [busy] synchronizer type 

So the kernel is confusing the mutex with some other type of synchronization object in the function ksyn_wqfind().

We added logging to ksyn_wqfind() to print the fields of the ksyn_wait_queue_t that ksyn_wq_hash_lookup() returns for that mutex's address. The kwq found is of type KSYN_WQTYPE_CVAR and has the following values:

kwq->kw_pre_intrcount=0
kwq->kw_pre_rwwc=0
kwq->kw_iocount=0
kwq->kw_inqueue=1
kwq->kw_fakecount=1

This means that the user-space address where the mutex was allocated used to belong to a condition variable, and that the condition variable left a stale ksyn_waitq_element_t behind in its wait queue, so its mapping in the hash table was never removed.

To give more context, the crashing test behaves as follows:
1- Creates a pthread_mutex_t and a pthread_cond_t.
2- A pool of worker threads uses these synchronization objects (pthread_cond_timedwait() to wait on the condition variable and pthread_cond_broadcast() to signal it).
3- Once the worker pool no longer uses the pthread_mutex_t and pthread_cond_t, both are destroyed.
4- Goes back to step 1.

We have verified that the failing mutex is indeed allocated at an address that used to belong to a condition variable.


## Kernel code inspection

The following block of code in the function ksyn_handle_cvbroad() creates the stale queue entry:

  if (diff_genseq(ckwq->kw_lword, ckwq->kw_sword)) {
    newkwe = TAILQ_FIRST(&kfreeq.ksynq_kwelist);
    if (newkwe == NULL) {
      ksyn_wqunlock(ckwq);
      newkwe = (ksyn_waitq_element_t)pthread_kern->zalloc(kwe_zone);
      goto retry;
    } else {
      TAILQ_REMOVE(&kfreeq.ksynq_kwelist, newkwe, kwe_list);
      ksyn_prepost(ckwq, newkwe, KWE_THREAD_BROADCAST, upto);
    }
  }

Further inspection of the kernel component of the pthread library indicates that __psynch_cvsignal() races with psynch_cvcontinue(). If the timed-out thread grabs the lock on ckwq first, the block of code above leaves a stale entry of type KWE_THREAD_BROADCAST in the queue associated with that condition variable. Trace:

1- The waiting thread times out, enters psynch_cvcontinue() and grabs the lock on ckwq. It removes itself from the queue and clears kw_lword, kw_uword and kw_sword.
From here there are two scenarios, depending on whether the timed-out thread successfully releases the wait queue in ksyn_wqrelease() before the broadcasting thread gets a reference to it.
(I) The broadcasting thread got a reference to the queue before the timed-out thread removed it
2- The broadcasting thread grabs the lock on kwq in __psynch_cvsignal(), resets the values of kw_lword, kw_uword and kw_sword in UPDATE_CVKWQ() and adds the stale entry in ksyn_handle_cvbroad(). Executing ksyn_cvupdate_fixup() does not clean up the stale entry because kw_sword and kw_lword do not match. The broadcasting thread returns to user space, leaving the stale entry behind.
(II) The timed-out thread released the original wait queue and the broadcasting thread got a reference to a new one
2- The broadcasting thread, executing __psynch_cvsignal(), gets a reference to a newly created queue from ksyn_wqfind(), with the values of kw_lword, kw_uword and kw_sword reset.
3- The broadcasting thread grabs the lock on kwq in __psynch_cvsignal() and adds the stale entry in ksyn_handle_cvbroad(). Executing ksyn_cvupdate_fixup() does not clean up the stale entry because kw_sword and kw_lword do not match. The broadcasting thread returns to user space, leaving the stale entry behind.

Attached is a C test that reproduces the failure. In the test, the main thread initializes a pthread_mutex_t and a pthread_cond_t in dynamically allocated blocks of memory. It then creates 2 worker threads, which repeatedly execute two phases separated by a reinitialization phase performed by the main thread.
Phase 1: One thread executes pthread_cond_timedwait() with a timeout of 1 second. The second thread sleeps for 1 second and then issues a pthread_cond_broadcast().
Reinitialization phase: After the worker threads are done with Phase 1, the main thread destroys the pthread_mutex_t and pthread_cond_t. It then reinitializes the pthread_mutex_t in the block of memory where the pthread_cond_t resided before destruction, and the pthread_cond_t in the block of memory where the pthread_mutex_t resided before destruction.
Phase 2: Both threads contend on the mutex, forcing a call to __psynch_mutexdrop().


Environment:
macOS High Sierra 10.13.4-10.13.6
xnu-4570.51.2~1/RELEASE_X86_64 x86_64
/usr/lib/system/libsystem_pthread.dylib (current version 301.50.1)