Enhancement
Resolution: Unresolved
P3
21
Currently, the source code in the JDK is in an ill-defined encoding: there is no official declaration of the encoding used. It is "mostly ASCII", but the encoding of the relatively few non-ASCII characters is not well-defined. In many cases it is Latin-1, but I am pretty certain other encodings are used, e.g. for Asian translations.
This is creating unnecessary problems when working with the JDK code base, for no reason other than historical baggage.
As JEP 400 (https://openjdk.org/jeps/400) confirms, UTF-8 is the way to go. We should follow up on this by converting our code base to UTF-8.
This includes basically the following steps:
* Tell git that the text files are encoded in UTF-8
* Look through the code base for text files containing non-ASCII characters, and convert them to UTF-8, if they are not already
* Update tooling used in building to recognize the fact that files are now in UTF-8 and treat them accordingly (basically, updating compiler flags).
Possibly, we should also:
* Update jcheck to verify that changes do not contain invalid UTF-8 encodings.
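The second step (finding text files with non-ASCII content and converting those that are not yet UTF-8) could be sketched roughly as below. This is only an illustration, not an existing JDK tool: the function names, the list of file suffixes, and the default assumption that a non-UTF-8 file is Latin-1 are all hypothetical, and a real conversion would need the legacy encoding confirmed per file.

```python
from pathlib import Path


def classify(data: bytes) -> str:
    """Return 'ascii', 'utf-8', or 'non-utf-8' for a file's raw bytes."""
    if data.isascii():
        return "ascii"
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
    except UnicodeDecodeError:
        return "non-utf-8"
    return "utf-8"


def convert_to_utf8(data: bytes, assumed_encoding: str = "latin-1") -> bytes:
    """Re-encode bytes as UTF-8.

    The source encoding is an assumption supplied by the caller; it cannot
    be reliably detected from the bytes alone.
    """
    return data.decode(assumed_encoding).encode("utf-8")


def scan(root: Path, suffixes=(".java", ".c", ".h", ".cpp", ".hpp", ".properties")):
    """Yield (path, classification) for every non-ASCII file under root.

    The suffix list is a hypothetical selection of text file types.
    """
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            kind = classify(path.read_bytes())
            if kind != "ascii":
                yield path, kind
```

Files reported as `utf-8` need no change, while files reported as `non-utf-8` are the candidates for conversion. The same strict-decoding check is also the kind of validation a jcheck rule for invalid UTF-8 could perform.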
- duplicates
  JDK-8301854 C4819 warnings were reported in libfreetype on Windows (Closed)
  JDK-8301855 C4819 warnings were reported in harfbuzz on Windows (Closed)
- relates to
  JDK-8134455 Clean out non-ASCII characters from source code (Open)