-
Enhancement
-
Resolution: Unresolved
-
P3
-
9
-
Fix Understood
-
generic
-
generic
It is difficult to create preformatted sections of text using the HTML `<pre>` element and embedded `{@code ...}` or `<code>` tags without encountering unintentional whitespace (whether leading, trailing, or indentation). Although we now have the `{@snippet}` taglet, which solves this and other problems, we should try to make it easier for authors to use pre/code tags without fighting whitespace. There are also hundreds of pre/code usages in the JDK and other Java code that suffer from these problems and are not likely to be converted to snippets any time soon.
There are two unrelated aspects to the problem:
- Indentation and trailing whitespace caused by the javadoc authoring convention of putting a space between the `*` and the comment text at the beginning of a line. This is something we could most easily fix in `DocCommentParser`, since we have the whole comment to work with. However, wholesale removal of leading spaces is not an option, as it breaks lookup of source positions for reporting errors and warnings. (This could be prevented by stripping leading space in the tokenizer, but that would change the official `getDocComment(Element)` API.) Instead, we can normalize leading whitespace in the parser in specific contexts, such as within `<pre>` elements and `{@code}` tags. This can break calculated positions *within* these particular doc trees (but not the positions of the trees themselves, so invalid tags/elements will still be reported correctly). It only affects one very specific reporting method[1], which has no practical use for content inside `<pre>` or `{@code}` trees. The spec for this method could be amended with a note on preformatted doc trees.
- Leading empty lines are mostly caused by `<pre>` elements ignoring an initial newline only if it immediately follows the start tag[2]. When the `<pre>` start tag is followed by a `<code>` element, all subsequent line breaks are displayed. This can't be fixed in the parser, and it's almost impossible to fix once the comment has been parsed (lots of very specific combinations of `StartElementTree`, `TextTree` and `LiteralTree` are affected). The only practical way of solving this is to use JavaScript to remove the leading line break in the browser. Fortunately, this takes just a few lines of script code and runs very smoothly.
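To make the two effects above concrete, here is a minimal sketch that imitates, in a simplified way, how the text of a traditional doc comment looks once the leading whitespace and asterisks are dropped. The class `PreWhitespaceDemo` and the method `stripAsterisks` are hypothetical names used only for illustration; this is not the actual `DocCommentParser` logic.

```java
import java.util.stream.Collectors;

public class PreWhitespaceDemo {

    // Simplified stand-in for comment extraction (an assumption, not javadoc's real code):
    // drop leading whitespace and the run of '*' on each line, keep everything after it.
    static String stripAsterisks(String rawBody) {
        return rawBody.lines()
                .map(line -> line.replaceFirst("^\\s*\\*+", ""))
                .collect(Collectors.joining("\n"));
    }

    public static void main(String[] args) {
        // Body of a doc comment written with the usual "space after the asterisk" convention.
        String raw = String.join("\n",
                " * <pre><code>",
                " * int x = 1;",
                " * </code></pre>");

        String text = stripAsterisks(raw);

        // Print each line in brackets so the leftover leading space is visible.
        text.lines().forEach(line -> System.out.println("[" + line + "]"));
    }
}
```

Every extracted line still begins with the space that followed the `*`, so the code inside the `<pre>` block is rendered with one column of unintended indentation. In addition, the newline after `<code>` does not immediately follow the `<pre>` start tag, so the browser displays it as an empty first line.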
[1]: https://docs.oracle.com/en/java/javase/23/docs/api/jdk.javadoc/jdk/javadoc/doclet/Reporter.html#print(javax.tools.Diagnostic.Kind,com.sun.source.util.DocTreePath,int,int,int,java.lang.String)
[2]: https://html.spec.whatwg.org/#the-pre-element:syntax
In both cases, the plan is to be minimally invasive by stripping only one whitespace item: a single leading space per line in traditional doc comments, and only if all lines have one (similar to `String.stripIndent()`), and a single leading blank line per `<pre><code>` content. Intentional whitespace therefore remains possible by adding an extra space or newline.
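The two stripping rules can be sketched as plain string operations. This is only an illustration under stated assumptions: the class and method names are hypothetical, line endings are assumed to be `\n`, empty lines are assumed not to count against the "all lines have one" check, and the blank-line rule is modeled here in Java even though the proposal applies it with script code in the browser.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

public class PreNormalizeSketch {

    // Remove exactly one leading space from each line, but only if every
    // non-empty line starts with a space. Treating empty lines as exempt
    // is an assumption of this sketch.
    static String stripOneLeadingSpace(String text) {
        String[] lines = text.split("\n", -1); // -1 keeps trailing empty lines
        boolean allIndented = Arrays.stream(lines)
                .filter(l -> !l.isEmpty())
                .allMatch(l -> l.startsWith(" "));
        if (!allIndented) {
            return text; // leave the content untouched
        }
        return Arrays.stream(lines)
                .map(l -> l.isEmpty() ? l : l.substring(1))
                .collect(Collectors.joining("\n"));
    }

    // Remove a single leading line break, if present.
    static String stripOneLeadingNewline(String content) {
        return content.startsWith("\n") ? content.substring(1) : content;
    }

    public static void main(String[] args) {
        // Content of a <pre><code> block: one leading newline, one conventional
        // space per line, plus one intentional extra space on the second line.
        String content = "\n int x = 1;\n  int y = 2;\n";
        String normalized = stripOneLeadingNewline(stripOneLeadingSpace(content));
        System.out.print(normalized);
    }
}
```

Because only one space and one line break are removed, the extra space before `int y = 2;` survives, which is how an author would keep intentional indentation under this scheme.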
- csr for: JDK-8350428 Improve whitespace normalization in preformatted text (Closed)
- relates to: JDK-8340819 Improve traditional documentation comment parsing rules (Closed)
- links to: Review(master) openjdk/jdk/23868