-
Enhancement
-
Resolution: Won't Fix
-
P3
-
None
-
None
-
None
-
generic
-
generic
The latest version of the JavaDoc Documentation Comment Specification[1] specifies the parsing of traditional documentation comments as follows:
"Traditional documentation comments are traditional comments that begin with /**. If any line in such a comment begins with asterisks after any leading whitespace, the leading whitespace and asterisks are removed. Any whitespace appearing after the asterisks is not removed."
[1]: https://docs.oracle.com/en/java/javase/23/docs/specs/javadoc/doc-comment-spec.html
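The quoted rule can be sketched in a few lines of Java. This is only an illustration of the rule as worded above, not the actual javadoc parser; the class and method names (`DocCommentLines`, `stripLeadingAsterisks`) are made up for this example, and it assumes that a line without leading asterisks is left unchanged.

```java
public class DocCommentLines {
    // Hypothetical helper: applies the quoted rule to one line of a /** ... */ comment body.
    static String stripLeadingAsterisks(String line) {
        int i = 0;
        // Skip any leading whitespace.
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        int j = i;
        // Skip the run of asterisks that follows, if any.
        while (j < line.length() && line.charAt(j) == '*') {
            j++;
        }
        // Assumption: a line that does not begin with asterisks is returned unchanged.
        // Whitespace after the asterisks is deliberately kept, as the rule states.
        return j > i ? line.substring(j) : line;
    }

    public static void main(String[] args) {
        String[] lines = { "     * <pre>", "     * int x = 1;", "     * </pre>" };
        for (String line : lines) {
            // Brackets make the retained leading space visible.
            System.out.println("[" + stripLeadingAsterisks(line) + "]");
        }
    }
}
```

Each printed line starts with `[ `, that is, with the space that followed the asterisk in the source.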
Under this rule, a space character that follows one or more leading asterisks in a traditional doc comment line is not removed and becomes part of the parsed comment. This is at odds with the vast majority of JavaDoc doc comments, in which each line almost always starts with an asterisk followed by a single space character that is not intended to be part of the doc comment line.
Of course, this mismatch is only possible because of the way whitespace is handled in HTML/CSS: in the default inline formatting context, runs of whitespace characters, including line breaks, are collapsed into a single space character.[2][3]
[2]: https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Whitespace
[3]: https://www.w3.org/TR/css-text-3/#white-space-processing
However, it creates problems when whitespace is preserved, such as for text contained within a `<pre>` element. In this case, the space following the '*' character becomes visible. Most developers are not aware of this, which is why even in OpenJDK source a large number of code samples within `<pre>` tags include an empty trailing line containing a single space character. As an example, see the code samples in `java.lang.String`[4] (also see attached screenshots).
[4]: https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/String.html
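To see where the extra trailing line comes from, the following sketch applies the current stripping rule to a small comment and extracts the text between `<pre>` and `</pre>`. The names (`PreTrailingLine`, `strip`, `preContent`) and the sample comment are invented for illustration; this is not the javadoc implementation.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class PreTrailingLine {
    // Current rule: remove leading whitespace and asterisks, keep what follows.
    static String strip(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        int j = i;
        while (j < line.length() && line.charAt(j) == '*') {
            j++;
        }
        return j > i ? line.substring(j) : line;
    }

    // Returns the raw text between <pre> and </pre> after the comment lines are parsed.
    static String preContent(String... commentLines) {
        String parsed = Arrays.stream(commentLines)
                .map(PreTrailingLine::strip)
                .collect(Collectors.joining("\n"));
        int start = parsed.indexOf("<pre>") + "<pre>".length();
        return parsed.substring(start, parsed.indexOf("</pre>"));
    }

    public static void main(String[] args) {
        String content = preContent("     * <pre>", "     * int x = 1;", "     * </pre>");
        // Make whitespace visible: prints \n_int_x_=_1;\n_
        System.out.println(content.replace("\n", "\\n").replace(' ', '_'));
    }
}
```

The content ends with a line break followed by a single space: the space that preceded `</pre>` in the source. HTML parsers ignore a line break directly after the `<pre>` start tag, but not this final line, so it shows up as an extra line containing one space.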
While fear of breaking existing documentation may have prevented JavaDoc developers from changing this aspect of doc comment parsing, there is surprisingly little cause for such concern:
- Because of the way HTML handles whitespace (see links above), a removed space character at the beginning of a line (i.e. following a line break) does not affect the layout.
- In many uses of preformatted text, a single space character will be removed from each line, which will not affect the relative layout of lines. This is true for the code samples in `java.lang.String` above and many other uses of the `<pre>` element in JDK doc comments.
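The first point can be illustrated with a rough approximation of whitespace collapsing. The `collapse` method below is a deliberately crude stand-in (a single regular expression), not the CSS white-space processing algorithm, and the class name and sample text are made up for this example.

```java
public class CollapseDemo {
    // Crude stand-in for HTML/CSS whitespace collapsing in normal (non-pre) flow:
    // every run of whitespace, including line breaks, becomes one space.
    static String collapse(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        // Parsed comment text with the space after the asterisk kept (current rule)...
        String kept = "Returns the length\n of this string.";
        // ...and with that space removed (proposed rule).
        String removed = "Returns the length\nof this string.";
        System.out.println(collapse(kept).equals(collapse(removed))); // prints true
    }
}
```

Under this approximation, both variants produce the same text, because the line break alone already collapses to a single space.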
One of the very few cases where removal of a space character at the beginning of a line can be observed is with combined `<pre><code>` elements, because in this context the initial line break is preserved, leading to a change in relative indentation in the lines following the opening tags.
I think that the compatibility risks are minor, and that we should at least consider changing the doc comment parsing rules to remove a single space following the leading asterisks.
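A minimal sketch of what the proposed rule could look like, assuming that exactly one space character (and not a tab or other whitespace) is dropped after the asterisks. The names `ProposedStripping` and `stripProposed` are hypothetical and do not refer to existing javadoc code.

```java
public class ProposedStripping {
    // Hypothetical variant of the rule: remove leading whitespace and asterisks,
    // plus at most one space character directly after the asterisks.
    static String stripProposed(String line) {
        int i = 0;
        while (i < line.length() && Character.isWhitespace(line.charAt(i))) {
            i++;
        }
        int j = i;
        while (j < line.length() && line.charAt(j) == '*') {
            j++;
        }
        if (j == i) {
            return line; // no leading asterisks: leave the line unchanged
        }
        if (j < line.length() && line.charAt(j) == ' ') {
            j++; // the proposed change: drop a single following space
        }
        return line.substring(j);
    }

    public static void main(String[] args) {
        String[] lines = {
            " * <pre>",
            " * if (x) {",
            " *     y();",
            " * }",
            " * </pre>"
        };
        for (String line : lines) {
            System.out.println("[" + stripProposed(line) + "]");
        }
    }
}
```

In this sketch, the tags and the code lines lose exactly one leading space each, so the nested `y();` call keeps its four-space indentation relative to the surrounding lines, and the line before `</pre>` no longer ends in a stray space.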
"Traditional documentation comments are traditional comments that begin with /**. If any line in such a comment begins with asterisks after any leading whitespace, the leading whitespace and asterisks are removed. Any whitespace appearing after the asterisks is not removed."
[1]: https://docs.oracle.com/en/java/javase/23/docs/specs/javadoc/doc-comment-spec.html
The fact that a space character following one or more leading asterisks within a traditional doc comment line is not removed and included with the parsed comment is at odds with the vast majority of JavaDoc doc comments, which almost always include a leading asterisk followed by a single space character which is not intended to be part of the doc comment line.
Of course this mismatch is possible because of the way whitespace is handled in HTML/CSS, where in the default inline formatting context multiple whitespace characters including line breaks are collapsed into a single space character.[2][3]
[2]: https://developer.mozilla.org/en-US/docs/Web/API/Document_Object_Model/Whitespace
[3]: https://www.w3.org/TR/css-text-3/#white-space-processing
However, it creates problems when whitespace is preserved, such as for text contained within a `<pre>` element. In this case, the space following the '*' character becomes visible. Most developers are not aware of this, which is why even in OpenJDK source a large number of code samples within `<pre>` tags include an empty trailing line containing a single space character. As an example, see the code samples in `java.lang.String`[4] (also see attached screenshots).
[4]: https://docs.oracle.com/en/java/javase/23/docs/api/java.base/java/lang/String.html
While fear of breaking existing documentation may have prevented JavaDoc developers from changing this aspect of doc comment parsing, there is surprisingly little cause for this.
- Because of the way HTML handles whitespace (see links above), a removed space character at the beginning of a line (i.e. following a line break) does not affect the layout.
- In many uses of preformatted text, a single space character will be removed from each line, which will not affect the relative layout of lines. This is true for the code samples in `java.lang.String` above and many other uses of the `<pre>` element in JDK doc comments.
One of the very few cases where removal of a space character at the beginning of a line can be observed is with combined `<pre><code>` elements, because in this context the initial line break is preserved, leading to a change in relative indentation in the lines following the opening tags.
I think that the compatibility risks are minor, and that we should at least consider changing the doc comment parsing rules to remove a single space following the leading asterisks.
- relates to
-
JDK-8346118 Improve whitespace normalization in preformatted text
-
- In Progress
-