JDK-4193134

(1.1) Problems loading classes named with non-English characters

    CPU: generic, x86, sparc
    OS: generic, solaris_2.5.1, solaris_9, windows_95, windows_nt



      Name: mf23781 Date: 11/26/98


      If the following testcase is compiled on a Win32 system it will not run. I
      believe (though I cannot verify) that it will fail on Solaris too:

      ------ cut here ----- Test.java -----
      class Test
      {
        public static void main(String args[]) {
           /* Should be able to load the class using both these methods */
           try {
              Class test = Class.forName("Test_\u00e9");
           } catch (Exception e) {
              System.out.println(e);
           }
           Test test = new Test_\u00e9();
        }
      }

      class Test_\u00e9 extends Test {
        public Test_\u00e9() {
           System.out.println("This is " + this);
        }
      }
      ----- cut here ----- end of Test.java -----

      The key point here is that it contains a class whose name includes a character
      outside the standard 7-bit ASCII set. In this case I have used the Unicode
      character \u00e9, which is a "lowercase e acute" -- a perfectly valid character
      for class names and one that occurs in the standard Latin-1 character set
      supported by the base Win32 codepage (and I believe the standard Solaris codepage).

      A problem occurs because internally the JVM processes this classname as a UTF-8
      encoded string and it is this UTF-8 form that gets passed into the function
      "LoadClassLocally()" in classloader.c. This function simply takes the string
      passed to it and tries to open the class file using that name; this fails
      because the actual class file on disk has the proper "e acute" character in its
      filename, not the UTF-8 encoding of it.
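
      The mismatch described above can be illustrated with a short sketch. It is
      written in modern Java (not the 1.1-era C code in classloader.c), and the class
      and method names are hypothetical. It shows that the "e acute" is one byte
      (e9) in Latin-1 but two bytes (c3 a9) in UTF-8, so a file system expecting
      Latin-1 names would see a different, longer name if handed the UTF-8 bytes.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not code from the JVM.
class EncodingMismatch {
    /** Formats bytes as space-separated lowercase hex, e.g. "c3 a9". */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    /** The name a Latin-1 file system would see if given the raw UTF-8 bytes. */
    static String utf8BytesReadAsLatin1(String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String name = "Test_\u00e9";
        // Only hex is printed, so the output does not depend on the console encoding.
        System.out.println("Latin-1: " + hex(name.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:   " + hex(name.getBytes(StandardCharsets.UTF_8)));
        // The mis-decoded name has 7 characters instead of 6, so it cannot match
        // the file that was written with the Latin-1 name.
        System.out.println("Equal names: " + name.equals(utf8BytesReadAsLatin1(name)));
    }
}
```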

      The obvious fix is for the "LoadClassFromFile()" function to map the UTF-8
      classname it receives back to platform encoding before trying to open the file.
      However, there are at least two things to consider if this is done, and I am not
      in a position to determine the correct resolution.

      1. During JVM initialisation the classes needed to map UTF-8 to local platform
         encoding may themselves not yet be loaded.

      2. The initial classname passed from the JAVA command line does NOT get turned
         into UTF-8 form before being passed to the classloader - this is an
         inconsistency in itself - and therefore, more by luck than design, this class
         file is currently found. However, it still won't load, as the name embedded
         in the classfile header is in UTF-8 form and so fails to match the actual
         class name, causing the classloader to reject the class.
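
      The mapping step proposed above might look roughly like the following. This is
      a minimal sketch in modern Java, not the actual fix in classloader.c; the class
      ClassFileNameMapper and the method toPlatformFileName are hypothetical names.
      It assumes the internal name is plain UTF-8 (the JVM's modified UTF-8 differs
      only for U+0000 and supplementary characters) and that the platform charset is
      passed in explicitly. It also sidesteps consideration 1, since the charsets used
      here are always available in a modern runtime.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the proposed mapping; not code from the JVM.
class ClassFileNameMapper {
    /**
     * Decodes a UTF-8 encoded class name and returns the bytes of the matching
     * class file name in the given platform charset.
     */
    static byte[] toPlatformFileName(byte[] utf8ClassName, Charset platform) {
        String name = new String(utf8ClassName, StandardCharsets.UTF_8);
        String fileName = name.replace('.', '/') + ".class";
        // Fail loudly rather than silently substituting '?' for unmappable characters.
        if (!platform.newEncoder().canEncode(fileName)) {
            throw new IllegalArgumentException(
                "class name cannot be represented in " + platform.name());
        }
        return fileName.getBytes(platform);
    }

    public static void main(String[] args) {
        byte[] internal = "Test_\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] fileName = toPlatformFileName(internal, StandardCharsets.ISO_8859_1);
        // The "e acute" is back to the single Latin-1 byte used on disk.
        System.out.println(String.format("%02x", fileName[5] & 0xff)); // e9
    }
}
```

      Rejecting names that the platform charset cannot represent is a design choice
      made for this sketch; a real loader might instead report the class as not found.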


      ======================================================================

            Assignee: Unassigned
            Reporter: Mick Fleming (miflemi)