Today, asynchronous UL off-loads the work of writing to output devices by having log sites copy their messages into a buffer, while an output thread swaps that buffer for an empty one and flushes its contents.
If the current buffer runs out of room, any further attempt at logging drops the message, that is, the message is never printed.
I'd like to suggest that we add a stalling mode to async UL, wherein no messages are dropped and each log site stalls until there is sufficient room in the buffer for its message.
I imagine that the user-facing change for async UL would be to equip -Xlog:async with an option, like so:
```
$ java -Xlog:async:drop ... # Drop messages
$ java -Xlog:async:stall ... # Stall log sites
$ java -Xlog:async ... # Drop messages by default
```
I have described the current async protocol, as it is in mainline, in appendix 0. There are two issues with this protocol:
1. When there is a high amount of logging in the system, there is also high contention on the mutex. This can cause the output thread to starve, as it cannot take the mutex to swap the buffers.
This in turn leads to excessive dropping of messages, and poor scaling when the buffer size is increased.
2. Implementing stalling (as opposed to dropping messages) under this protocol leads to unnecessary waiting at the log sites, as sketched just below this list.
Stalling would mean that a log site has to wait for the entire log buffer to be emptied before it can log.
That may mean 1 MiB of data having to be written out before the swap can occur, and of course the swap requires the output thread to acquire the mutex, which has its own issues (see issue 1).
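To make issue 2 concrete, here is a minimal, hypothetical sketch (in portable C++, with std::mutex/std::condition_variable standing in for HotSpot's Monitor) of what a stalling log site would have to look like under the current double-buffer design. None of these names exist in the source; this only illustrates why the wait is so long, it is not a proposed implementation.
```cpp
#include <condition_variable>
#include <cstddef>
#include <cstring>
#include <mutex>

// Hypothetical stalling variant of the CURRENT double-buffer protocol.
struct Buffer {
  char data[1024 * 1024];   // 1 MiB, as in the example above
  size_t used = 0;
  bool has_space(size_t n) const { return used + n <= sizeof(data); }
  void internal_write(const char* msg) {
    size_t n = strlen(msg);
    memcpy(data + used, msg, n);
    used += n;
  }
};

static std::mutex m;                     // the single mutex of the old protocol
static std::condition_variable swapped;  // the output thread would notify this after swapping
static Buffer a, b;
static Buffer* cb = &a;                  // current buffer that log sites write into

void stalling_log_site(const char* message) {
  std::unique_lock<std::mutex> lk(m);
  // Room can only appear through a buffer swap, and the output thread can only
  // swap after it has finished printing the previously swapped-out buffer and
  // has re-acquired m. A full buffer therefore means waiting for up to 1 MiB of output.
  swapped.wait(lk, [&] { return cb->has_space(strlen(message)); });
  cb->internal_write(message);
}
```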
Issue 1 is most easily observed by inducing a (perhaps artificially) high amount of logging, by setting all=trace in a slow-debug build:
```
$ slow-debug/jdk/bin/java -Xlog:async -Xlog:all=trace:file=asynctest::filecount=0
$ grep 'dropped due to' asynctest
[0.147s][warning][ ] 1338 messages dropped due to async logging
[0.171s][warning][ ] 496 messages dropped due to async logging
[0.511s][warning][ ] 98 messages dropped due to async logging
[0.542s][warning][ ] 13 messages dropped due to async logging
[0.762s][warning][ ] 5620 messages dropped due to async logging
[1.286s][warning][ ] 12440 messages dropped due to async logging
[1.316s][warning][ ] 2610 messages dropped due to async logging
[1.352s][warning][ ] 2497 messages dropped due to async logging
```
I'd like to suggest that we re-implement async logging using a producer/consumer protocol backed by circular buffers instead. The output thread then no longer suffers from starvation and can continuously make progress.
It also means that we don't have to split the buffer memory in two: the log threads and the output thread share the whole buffer at the same time. The new protocol is a basic producer/consumer pattern with two mutexes and a circular buffer governed by two variables, head and tail. The consumer reads tail and sets head, while the producers read head and write to tail. In this case the producers are the log sites and the consumer is the output thread. See appendix 1 for a pseudo-code description of the protocol. Some details are omitted, such as how stalling is implemented (hint: wait/notify on the producer lock); please read the code for those.
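As a reading aid for the head/tail bookkeeping, here is a small self-contained C++ sketch of the index arithmetic such a circular buffer could use. The helper names mirror the pseudo-code in appendix 1 (with the buffer size passed explicitly), but the details, in particular keeping one byte unused so that head == tail can only mean "empty", are assumptions of this sketch and not necessarily how the patch resolves the full-versus-empty ambiguity.
```cpp
#include <cstddef>

// Indices into a circular buffer of `size` bytes. The consumer (output thread)
// advances `head`, the producers (log sites) advance `tail`. Valid data
// occupies [head, tail), modulo `size`.

// Bytes currently available for the consumer to read.
static size_t used_memory(size_t head, size_t tail, size_t size) {
  return (tail + size - head) % size;   // 0 when head == tail, i.e. empty
}

// Bytes a producer may still write. One byte is kept unused so that a
// completely full buffer is never indistinguishable from an empty one.
static size_t unused_memory(size_t head, size_t tail, size_t size) {
  return size - used_memory(head, tail, size) - 1;
}

// After writing or reading n bytes, an index simply wraps around.
static size_t advance(size_t index, size_t n, size_t size) {
  return (index + n) % size;
}
```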
Appendix 0, the old async protocol.
```
Variables:
  Mutex m;
  Buffer* cb;
  Buffer a;
  Buffer b;
  const char* message;

Methods:
  void Buffer::write(const char* message) {
    if (has_space(strlen(message))) {
      internal_write(message);
    }
  }

LogSite:
  1. m.lock();
  2. cb->write(message);
  3. m.unlock();
  4. m.notify();

OutputThread:
  1. m.wait();
  2. m.lock();
  3. buffer := cb;
  4. if cb == &a then cb = &b; else cb = &a;
  5. m.unlock();
  6. for each message in buffer: print it to the output
  7. go to 1.
```
Appendix 1, the new protocol.
Nota bene: Any absence or presence of bugs may be the opposite in the actual source code.
```
Variables:
  Mutex consumer;
  Mutex producer;
  CircularBuffer cb;
  size_t head;
  size_t tail;
  const char* message;
  char* out;

LogSite:
  1. producer.lock();
  2. t = tail; h = head;                           // Read the current indices
  3. if unused_memory(h, t) < strlen(message) then drop_or_stall();  // Drop: skip the write; stall: wait for room
  4. size_t bytes_written = cb.write(message);     // Writes the string and its envelope
  5. tail = (t + bytes_written) % cb.size();       // Message ready to consume, move tail
  6. producer.unlock();

OutputThread:
  1. consumer.lock();
  2. t = tail; h = head;
  3. if h == t then { consumer.unlock(); return; } // No message available
  4. size_t bytes_read = cb.read_bytes(out, h);    // Copy the next message into out
  5. head = (h + bytes_read) % cb.size();          // More memory available for producers
  6. consumer.unlock();
```
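The pseudo-code above leaves drop_or_stall out. Going only by the hint (wait/notify on the producer lock), one possible shape of it is sketched below with std::condition_variable rather than HotSpot's Monitor. The helper names, the wake-up helper on the consumer side, and the decision to briefly take the producer lock when publishing the new head (to avoid lost wake-ups) are all assumptions of this sketch, not a description of the actual change.
```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <mutex>

static std::mutex producer_lock;          // the "producer" mutex from the pseudo-code
static std::condition_variable not_full;  // log sites stall here until room is freed
static size_t head = 0;                   // only written by the output thread
static std::atomic<size_t> tail{0};       // written by log sites, read by the output thread
static bool stall_mode = true;            // i.e. -Xlog:async:stall was given

// Same free-space helper as in the earlier sketch (one byte kept unused).
static size_t unused_memory(size_t h, size_t t, size_t size) {
  return (h + size - t - 1) % size;
}

// Called by a log site with producer_lock held (through lk). Returns true if
// the caller may write `needed` bytes, false if the message should be dropped.
static bool drop_or_stall(std::unique_lock<std::mutex>& lk, size_t needed, size_t buf_size) {
  if (unused_memory(head, tail.load(), buf_size) >= needed) {
    return true;                          // enough room, nothing to do
  }
  if (!stall_mode) {
    return false;                         // drop mode: give up immediately
  }
  // Stall mode: wait on the producer lock until the output thread has freed room.
  not_full.wait(lk, [&] {
    return unused_memory(head, tail.load(), buf_size) >= needed;
  });
  return true;
}

// Called by the output thread after it has printed a message. Publishing the
// new head under the producer lock prevents a stalled log site from missing
// the wake-up between its room check and its wait.
static void publish_head_and_wake(size_t new_head) {
  {
    std::lock_guard<std::mutex> lg(producer_lock);
    head = new_head;
  }
  not_full.notify_all();
}
```
Whether the output thread really needs to touch the producer lock for every message (it could, for instance, only do so when a log site is known to be stalled) is exactly the kind of detail the real code may solve differently.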
Review (master): openjdk/jdk/17757