-
Bug
-
Resolution: Duplicate
-
P3
-
None
-
5.0
-
generic
-
solaris_8
Currently, class files whose names contain non-ASCII characters cause no end of problems for VM implementations. File system support for Unicode characters in file names differs among platforms and is therefore not reliable. As a result, many programs written in non-US locales are forced to use ASCII characters for their class names for portability, which undermines the internationalization support of the Java language and platform. Even the jar file implementation, based as it is upon an underlying C implementation, has trouble with such file names stored within it.
We propose that the mapping from class name to file name be specified to translate any character in the class or package name that is not a printable ASCII character into the characters "+Unnnn", where "+" is the ASCII plus character, "U" is the ASCII "U" character, and nnnn are the hex digits of the Unicode representation of the character. Surrogates (i.e., a pair of such encodings) would be used for characters in the higher Unicode planes. No changes in the language, VM, or API specifications are needed; this mapping was never specified in any existing document. However, we recommend that this mapping be described in either or both of the new JLS and JVMS documents beginning in Tiger.
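As a rough illustration of the proposed mapping, here is a minimal Java sketch. The class name `ClassFileNames` and the method `toFileName` are hypothetical and not part of any JDK API. The sketch assumes four uppercase hex digits per UTF-16 code unit, treats the range 0x20 to 0x7E as "printable ASCII", and leaves a literal "+" in a name untouched, since the proposal does not say how that case should be handled.

```java
import java.util.Locale;

// Hypothetical helper illustrating the proposed "+Unnnn" mapping; not a JDK API.
final class ClassFileNames {
    private ClassFileNames() {}

    /** Maps a class name such as "pkg.Name" to a relative class-file path. */
    static String toFileName(String className) {
        StringBuilder sb = new StringBuilder();
        // Iterate over UTF-16 code units, so a supplementary character
        // naturally becomes a pair of "+Unnnn" encodings (its surrogates).
        for (int i = 0; i < className.length(); i++) {
            char c = className.charAt(i);
            if (c == '.') {
                sb.append('/');                 // package separator
            } else if (c >= 0x20 && c <= 0x7E) {
                sb.append(c);                   // printable ASCII is kept as-is (assumed range)
            } else {
                // Assumption: four uppercase hex digits per code unit.
                sb.append(String.format(Locale.ROOT, "+U%04X", (int) c));
            }
        }
        return sb.append(".class").toString();
    }
}
```

Under these assumptions, `ClassFileNames.toFileName("pkg.Caf\u00e9")` yields `pkg/Caf+U00E9.class`. Note that without an escape rule for "+" itself, a class literally named `Caf+U00E9` would map to the same file name, which a real specification would need to address.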
On loading a given class whose name contains characters that are not printable ASCII, the VM could try both locations (that is, the rewritten file name AND the non-rewritten file name). That allows full backward compatibility with existing class files that may use non-ASCII characters in their names.
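The two-location lookup could be sketched along the following lines. This is only an illustration: `DualLookup`, `escaped`, `legacy`, and `find` are hypothetical names, the directory-based lookup stands in for whatever a real class loader does, and trying the rewritten name first is an assumption, since the proposal does not fix an order.

```java
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Optional;

// Hypothetical lookup helper; a real class loader would be more involved.
final class DualLookup {
    private DualLookup() {}

    /** Rewritten name: non-printable-ASCII code units become "+Unnnn" (assumed format). */
    static String escaped(String className) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < className.length(); i++) {
            char c = className.charAt(i);
            if (c == '.') sb.append('/');
            else if (c >= 0x20 && c <= 0x7E) sb.append(c);
            else sb.append(String.format(Locale.ROOT, "+U%04X", (int) c));
        }
        return sb.append(".class").toString();
    }

    /** Non-rewritten name, as existing class files would be stored. */
    static String legacy(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Tries the rewritten location first (an assumed order), then the legacy one. */
    static Optional<Path> find(Path root, String className) {
        for (String relative : new String[] { escaped(className), legacy(className) }) {
            try {
                Path candidate = root.resolve(relative);
                if (Files.isRegularFile(candidate)) {
                    return Optional.of(candidate);
                }
            } catch (InvalidPathException e) {
                // The file system cannot represent this name; try the next candidate.
            }
        }
        return Optional.empty();
    }
}
```

Because the legacy name is still consulted, class files written before the mapping existed remain loadable, which is the backward-compatibility property described above.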
If we decide to do this, changes are required in javac and the standard class loaders.
- duplicates
-
JDK-4421728 Specification required for mapping from class names to file names
-
- Closed
-