  1. Skara
  2. SKARA-2057

Race in BotRunner


    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • bots

      In my recent PR https://github.com/openjdk/skara/pull/1567, a test failed randomly in GHA: BotRunnerTests::dependentItems. While investigating the test by running it locally in my IDE, I managed to reproduce the failure once. This led me to believe there is a race somewhere in BotRunner which is causing this test to fail intermittently.

      I believe I have found it. The drain() method can sometimes return before all WorkItems have been processed. This can happen because of how the pending and active maps are updated in RunnableWorkItem.runMeasured(). When a WorkItem is done, it is removed from the active map. This happens in the finally clause in runMeasured(), in its own synchronized block. Follow-up WorkItems are then scheduled further down in the method, in a separate synchronized block. Between those two blocks, drain() can observe both the active and pending maps as empty while follow-up WorkItems are still in the process of being scheduled.
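
      The window can be illustrated with a small, simplified model. This is only a sketch and not the actual BotRunner code: the class RacyRunnerSketch, its methods, the use of string sets instead of the real maps, and the betweenBlocks hook (which stands in for another thread calling drain() at exactly the wrong moment) are all hypothetical names chosen for illustration.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the bookkeeping described above.
// It is NOT the actual Skara BotRunner implementation.
class RacyRunnerSketch {
    private final Object lock = new Object();
    private final Set<String> pending = new HashSet<>();
    private final Set<String> active = new HashSet<>();

    // Mark an item as currently running.
    void start(String item) {
        synchronized (lock) {
            active.add(item);
        }
    }

    // What a drain()-style check would see: nothing pending, nothing active.
    boolean looksDrained() {
        synchronized (lock) {
            return pending.isEmpty() && active.isEmpty();
        }
    }

    // Mirrors the problematic ordering: the finished item leaves `active`
    // in one synchronized block, and its follow-ups enter `pending` in a
    // later one. `betweenBlocks` simulates another thread running in the gap.
    void finish(String item, List<String> followUps, Runnable betweenBlocks) {
        synchronized (lock) {
            active.remove(item);
        }
        betweenBlocks.run(); // both collections are empty here
        synchronized (lock) {
            pending.addAll(followUps);
        }
    }
}

class RacyRunnerDemo {
    public static void main(String[] args) {
        var runner = new RacyRunnerSketch();
        runner.start("A");
        boolean[] observed = new boolean[1];
        // Item "A" finishes and wants to schedule follow-up "B".
        runner.finish("A", List.of("B"), () -> observed[0] = runner.looksDrained());
        System.out.println("drained in the gap: " + observed[0]);          // true
        System.out.println("drained afterwards: " + runner.looksDrained()); // false
    }
}
```

      In this model, the check made inside the gap reports that everything is done even though follow-up "B" is about to be scheduled, which matches the premature return from drain() described above.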

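      To fix this, we need to reorder the operations in runMeasured() so that follow-up WorkItems are scheduled before the finished WorkItem is removed from the active map. That way drain() never sees both maps empty while work remains.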
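
      One possible shape of such a reordering, as a sketch under assumptions rather than the actual change: the class ReorderedRunnerSketch, its methods, the string sets, and the betweenBlocks hook (standing in for a concurrent drain() call) are hypothetical. Here follow-ups are added to pending first, and only then is the finished item removed from active.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of a reordered completion step.
// It is NOT the actual Skara BotRunner implementation.
class ReorderedRunnerSketch {
    private final Object lock = new Object();
    private final Set<String> pending = new HashSet<>();
    private final Set<String> active = new HashSet<>();

    // Mark an item as currently running.
    void start(String item) {
        synchronized (lock) {
            active.add(item);
        }
    }

    // What a drain()-style check would see: nothing pending, nothing active.
    boolean looksDrained() {
        synchronized (lock) {
            return pending.isEmpty() && active.isEmpty();
        }
    }

    // Follow-ups become visible in `pending` BEFORE the finished item
    // leaves `active`, so an observer in the gap always sees some work.
    void finish(String item, List<String> followUps, Runnable betweenBlocks) {
        synchronized (lock) {
            pending.addAll(followUps);
        }
        betweenBlocks.run(); // the finished item is still in `active` here
        synchronized (lock) {
            active.remove(item);
        }
    }
}

class ReorderedRunnerDemo {
    public static void main(String[] args) {
        var runner = new ReorderedRunnerSketch();
        runner.start("A");
        boolean[] observed = new boolean[1];
        runner.finish("A", List.of("B"), () -> observed[0] = runner.looksDrained());
        System.out.println("drained in the gap: " + observed[0]);          // false
        System.out.println("drained afterwards: " + runner.looksDrained()); // false
    }
}
```

      Doing both updates inside a single synchronized block would close the gap as well. Which variant fits best depends on what else runMeasured() does between the two steps.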

      As far as I can tell, this won't affect bots in normal operation. drain() is only called when running a bot in 'single' mode, which is mostly used for testing.

            Assignee: Unassigned
            Reporter: Erik Joelsson (erikj)