Type: Bug
Resolution: Fixed
Priority: P3
Fix Version/s: 0.9
Affects Version/s: 0.9
Component/s: bots
Labels:
- skara-outage

Today we had an outage where the disk that some of the bots use for data stopped working. Everything ground to a halt, blocked on I/O, except the health status endpoint, which happily kept returning HTTP 200 responses. After a while the watchdog hit its timeout and called System.exit(1), but that made no difference because the JVM process could not shut down.

I want to change the health status endpoint so that when the watchdog hits its timeout, the health status also flips to unhealthy. That way we will react faster to this situation next time.
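A minimal sketch of the idea, not the actual Skara change: the class name WatchdogHealth, its methods, and the choice of 503 as the unhealthy status code are all hypothetical and chosen only for illustration. The point it shows is that the health flag is flipped when the timeout is detected, so the endpoint stops reporting 200 even if a later System.exit(1) cannot complete.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; names are illustrative and not taken from the Skara code base.
public class WatchdogHealth {
    private final Duration timeout;
    private final AtomicReference<Instant> lastPing;
    private final AtomicBoolean healthy = new AtomicBoolean(true);

    public WatchdogHealth(Duration timeout, Instant now) {
        this.timeout = timeout;
        this.lastPing = new AtomicReference<>(now);
    }

    // Called by the bot work loop whenever it makes progress.
    public void ping(Instant now) {
        lastPing.set(now);
    }

    // Called periodically by the watchdog. Returns true if the timeout fired.
    // The health flag is flipped here, before the caller attempts any exit,
    // so the endpoint reports the failure even if the process cannot go down.
    public boolean check(Instant now) {
        if (Duration.between(lastPing.get(), now).compareTo(timeout) > 0) {
            healthy.set(false);
            return true;
        }
        return false;
    }

    // Status code the health endpoint would serve (503 is an assumption).
    public int statusCode() {
        return healthy.get() ? 200 : 503;
    }
}
```

In this sketch the caller would invoke System.exit(1) only after check returns true; the exit call is left out so the class stays easy to exercise in isolation. The flag is deliberately sticky: once the watchdog has fired, later pings do not make the endpoint healthy again.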

links to

Commit openjdk/skara/bcf27471

Review openjdk/skara/1163

Assignee: Erik Joelsson

Reporter: Erik Joelsson

Votes: 0

Watchers: 2

Created: 2021-05-21 10:35

Updated: 2021-05-24 06:30

Resolved: 2021-05-24 06:30
