Type: Enhancement
Resolution: Unresolved
Priority: P4
The JDK source code should normally be in ASCII. There are legitimate reasons for including characters outside the ASCII range, but they should be added intentionally and only when really needed. See JDK-8354213 for examples of how Cyrillic letters in identifiers, non-breaking spaces in comments, etc. have crept into the source code.
jcheck already adds a warning when a developer wants to add a large binary file (which is allowed, but you should have good reasons for it). In the same way, I believe jcheck should warn when a non-ASCII character is added in a PR. It is imperative that this is not a blocker, just a signal to double-check that this is indeed what is wanted. This also gives the developer an extra chance to verify that the non-ASCII character is indeed encoded in UTF-8.
Some changes (e.g. to i18n code) might include a large number of non-ASCII character changes, so we must avoid spamming the PR with warnings when a single PR adds many non-ASCII characters.
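To make the proposal concrete, here is a minimal sketch of what such a check could look like. It is not based on the actual jcheck code: the class `NonAsciiCheck`, its methods, the message wording, and the `maxWarnings` cap are all hypothetical names chosen for illustration. It assumes the check receives the added lines of a file already decoded as text, reports the first non-ASCII character on each offending line, and collapses everything beyond the cap into a single summary line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; this is not the actual jcheck API. */
final class NonAsciiCheck {

    /**
     * Returns at most maxWarnings per-line warnings for the added lines of one
     * file, plus one summary line if further offending lines were suppressed.
     * Line numbers are positions within addedLines, not within the file.
     */
    static List<String> warnings(String path, List<String> addedLines, int maxWarnings) {
        List<String> result = new ArrayList<>();
        int hits = 0;
        for (int i = 0; i < addedLines.size(); i++) {
            String line = addedLines.get(i);
            int idx = firstNonAscii(line);
            if (idx < 0) {
                continue; // pure ASCII, nothing to report
            }
            hits++;
            if (hits <= maxWarnings) {
                // codePointAt also handles characters outside the BMP
                int cp = line.codePointAt(idx);
                result.add(String.format(
                        "%s: added line %d contains non-ASCII character U+%04X",
                        path, i + 1, cp));
            }
        }
        if (hits > maxWarnings) {
            // One summary instead of one warning per line, to avoid spamming the PR
            result.add(String.format(
                    "%s: %d more added lines with non-ASCII characters not shown",
                    path, hits - maxWarnings));
        }
        return result;
    }

    /** Index of the first char above 0x7F, or -1 if the string is pure ASCII. */
    static int firstNonAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return i;
            }
        }
        return -1;
    }
}
```

The result is only a list of warning strings, so nothing here blocks integration. Whether the cap should apply per file or per PR is left open in this sketch.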