Summary
Introduce a new Unified Logging (UL) output option, foldmultilines, to replace newline characters within a multiline log event with the character sequence '\' and 'n'.
Problem
Most UL entries print each log message as a single line. However, in some cases (e.g. exceptions), a single log message is printed as multiple lines:
[0.157s][info][exceptions] Exception <a 'java/lang/NullPointerException'{0x000000008b918f70}: test>
thrown in interpreter method <{method} {0x00007f8335000248} 'main' '([Ljava/lang/String;)V' in 'Test'>
at bci 9 for thread 0x00007f8330017160 (main)
UL output is easier to parse with log shippers (Fluent Bit, Logstash, etc.) if each log message is printed as a single line.
Popular log shippers support multiline logs, but their configuration tends to be complex, and some input plugins (e.g. TCP on Fluent Bit) do not support multiline logs at all.
Solution
Introduce a new boolean UL option, foldmultilines, to escape newline (\n: 0x0a) and backslash (\: 0x5c) characters in the UL output. It is false by default.
When writing a character buffer into the output file, we treat the buffer as a stream of (8-bit) bytes and handle each byte individually:
- If the byte is the newline character (0x0a), the following two bytes will be written into the output file: 0x5c (backslash), 0x6e (lowercase 'n')
- If the byte is the backslash character (0x5c), the following two bytes will be written into the output file: 0x5c (backslash), 0x5c (backslash)
- Otherwise the byte is written as is into the output file
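The three rules above can be sketched in Java as follows. This is only an illustration of the byte-level transformation, not the HotSpot implementation; the class name FoldSketch and the method name fold are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: names are illustrative and not taken from the JDK sources.
final class FoldSketch {
    // Applies the three byte-level rules to the bytes of one log message.
    static byte[] fold(byte[] message) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(message.length);
        for (byte b : message) {
            if (b == 0x0a) {            // newline -> backslash, 'n'
                out.write(0x5c);
                out.write(0x6e);
            } else if (b == 0x5c) {     // backslash -> backslash, backslash
                out.write(0x5c);
                out.write(0x5c);
            } else {                    // any other byte is copied unchanged
                out.write(b);
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] raw = "line 1\nline \\2\nline 3".getBytes(StandardCharsets.UTF_8);
        // Prints the message on a single line: line 1\nline \\2\nline 3
        System.out.println(new String(fold(raw), StandardCharsets.UTF_8));
    }
}
```

Because every input byte maps to either one or two output bytes, the folded message is at most twice as long as the original.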
Note that foldmultilines=true should be used only with compatible character encodings. For example, this proposal may inadvertently convert 0x0a and 0x5c bytes in Shift JIS and BIG5 text, because those encodings can have 0x0a and/or 0x5c inside multi-byte sequences. This proposal is compatible with UTF-8, where no multi-byte sequence contains either byte.
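The encoding caveat can be illustrated with a single character. The sketch below assumes the well-known Shift JIS encoding of U+8868 (0x95 0x5c) and hard-codes those bytes so that it does not depend on an optional charset being installed; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: shows which encodings expose bytes that folding would rewrite.
final class EncodingHazardSketch {
    // Counts the bytes that byte-wise folding would rewrite (0x0a or 0x5c).
    static int countRewrittenBytes(byte[] bytes) {
        int count = 0;
        for (byte b : bytes) {
            if (b == 0x0a || b == 0x5c) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // U+8868 in Shift JIS: the trail byte is 0x5c, the same value as an ASCII backslash.
        byte[] shiftJis = { (byte) 0x95, (byte) 0x5c };
        // The same character in UTF-8 is 0xe8 0xa1 0xa8: every byte has the high bit set.
        byte[] utf8 = "\u8868".getBytes(StandardCharsets.UTF_8);

        System.out.println(countRewrittenBytes(shiftJis)); // 1
        System.out.println(countRewrittenBytes(utf8));     // 0
    }
}
```

In the Shift JIS case the trail byte would be doubled, so the character can no longer be decoded correctly by a reader that expects Shift JIS.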
foldmultilines does not affect the newline between log events. Each log event is still terminated by exactly one newline character.
Log message in C
char *message = "line 1\nline \\2\nline 3";
foldmultilines=false
line 1
line \2
line 3
foldmultilines=true
line 1\nline \\2\nline 3
The original text, including its newline characters, can be restored by scanning the line from left to right, replacing each pair of '\' characters with a single '\' and each sequence "\n" with an actual newline character. After each replacement the scan must resume after the two characters just consumed, so that a decoded backslash is never treated as the start of another escape sequence. This is demonstrated by the following Java code:
String decoded = "";
int fromIndex = 0;
int found = 0;
// Find backslash from the log entry.
while ((found = ENCODED_LOGLINE.indexOf('\\', fromIndex)) != -1) {
    if (found == ENCODED_LOGLINE.length() - 1) {
        throw new RuntimeException("Backslash shouldn't be located at the tail.");
    }
    decoded += ENCODED_LOGLINE.substring(fromIndex, found);
    char next = ENCODED_LOGLINE.charAt(found + 1);
    if (next == '\\') {          // next char is backslash ("\\\\" in Java)
        decoded += '\\';         // Treat as single backslash
    } else if (next == 'n') {    // next char is "n" ("\\n" in Java)
        decoded += '\n';         // Treat as newline
    }
    fromIndex = found + 2;       // Forward index to next char
}
decoded += ENCODED_LOGLINE.substring(fromIndex); // Add remaining chars to decoded string
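For experimentation, the same decoding logic can be wrapped in a self-contained method and checked against the sample message shown earlier. This is a sketch only: the names UnfoldSketch and unfold are hypothetical, and rejecting unknown escape sequences is an assumption of this sketch rather than part of the proposal.

```java
// Hypothetical helper: a self-contained variant of the decoding logic.
final class UnfoldSketch {
    // Restores the original message from one folded log line.
    static String unfold(String encoded) {
        StringBuilder decoded = new StringBuilder(encoded.length());
        int i = 0;
        while (i < encoded.length()) {
            char c = encoded.charAt(i);
            if (c != '\\') {                 // ordinary character: copy as is
                decoded.append(c);
                i++;
                continue;
            }
            if (i + 1 >= encoded.length()) {
                throw new IllegalArgumentException("Dangling backslash at end of line");
            }
            char next = encoded.charAt(i + 1);
            if (next == '\\') {              // "\\" -> single backslash
                decoded.append('\\');
            } else if (next == 'n') {        // "\n" -> newline
                decoded.append('\n');
            } else {                         // assumption: anything else is malformed input
                throw new IllegalArgumentException("Unexpected escape: \\" + next);
            }
            i += 2;                          // skip both characters of the escape
        }
        return decoded.toString();
    }

    public static void main(String[] args) {
        String folded = "line 1\\nline \\\\2\\nline 3";   // the foldmultilines=true sample
        String original = "line 1\nline \\2\nline 3";      // the original C message
        System.out.println(unfold(folded).equals(original)); // true
    }
}
```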
Specification
description for output options in man page of java
output-options is
filecount=file-count filesize=<file size with optional K, M or G suffix> foldmultilines=<true|false>
command line
$ java -Xlog:exceptions=info:file=npe.log::foldmultilines=true Test
npe.log
[0.166s][info][exceptions] Exception <a 'java/lang/NullPointerException'{0x000000008b918f70}: test>\n thrown in interpreter method <{method} {0x00007fbc81000248} 'main' '([Ljava/lang/String;)V' in 'Test'>\n at bci 9 for thread 0x00007fbc9c0171a0 (main)
csr of: JDK-8271186 Add UL option to replace newline char (Resolved)