JDK-5049382

Compiler fails on "invalid" bytes in comment lines in a UTF-8 environment


    • Type: Bug
    • Resolution: Duplicate
    • Priority: P3
    • Fix Version: None
    • Affects Version: 5.0
    • Component: tools
    • CPU: generic
    • OS: generic

      java version : 1.5.0-beta2 b51
      Platform : Solaris Sparc 9
      Locale : ja_JP.UTF-8, zh_CN.UTF-8, ... (any UTF-8 configs)

      If "invalid" bytes appear in comment lines of a Java source file, as in the attached HelloWorld.java, the compiler fails in UTF-8 locales. The compiler should handle this the old way (1.4: silently replace them with the Unicode replacement character).
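
      The two behaviors described above can be illustrated with java.nio.charset. This is only a sketch, not the compiler's own code: the class name CommentDecodeSketch, the sample bytes, and the mapping of "1.4 behavior" to CodingErrorAction.REPLACE and "1.5 behavior" to CodingErrorAction.REPORT are assumptions made for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class CommentDecodeSketch {
    // "// " followed by two bytes (0xC6 0xFC) that are not valid UTF-8, then a newline.
    static final byte[] COMMENT_LINE = {'/', '/', ' ', (byte) 0xC6, (byte) 0xFC, '\n'};

    // Decode the bytes as UTF-8 using the given policy for bad input.
    static String decode(byte[] bytes, CodingErrorAction action) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        return decoder.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // True if strict decoding (REPORT) rejects the bytes.
    static boolean failsStrictly(byte[] bytes) {
        try {
            decode(bytes, CodingErrorAction.REPORT);
            return false;
        } catch (CharacterCodingException e) {
            return true;
        }
    }

    public static void main(String[] args) throws CharacterCodingException {
        // Lenient policy: bad bytes become U+FFFD and decoding continues.
        String lenient = decode(COMMENT_LINE, CodingErrorAction.REPLACE);
        System.out.println("lenient result contains U+FFFD: " + lenient.contains("\uFFFD"));

        // Strict policy: the same bytes cause a decoding error.
        System.out.println("strict decoding fails: " + failsStrictly(COMMENT_LINE));
    }
}
```

      With the lenient policy the comment text is damaged but the rest of the file can still be read; with the strict policy a single bad byte in a comment is enough to stop decoding.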

      This issue is caused by the CCC4767128 putback. I agree that compilation should fail if the "invalid" bytes are outside Java comments. However, putting native characters in comment lines is expected and common behavior for non-English-speaking end users.

      On the other hand, even though HelloWorld.java includes "invalid" bytes, it compiles without any problem if the locale is not a UTF-8 locale. This does not seem to match the CCC strictly; the compilation should also fail as designed. I tried zh_CN.GBK, ja_JP.eucJP, and zh_CN.eucCN. BTW, the "invalid" bytes are in the eucJP encoding.
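
      One possible explanation for the difference between locales is that the same byte sequence can be well-formed in one encoding and malformed in another. The sketch below strictly decodes a sample eucJP byte sequence under several charsets. The class name LocaleCharsetSketch and the sample bytes are hypothetical (they are not taken from the attached file), and EUC-JP and GBK are optional charsets that may be missing from some runtimes, so the code checks for them first.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

public class LocaleCharsetSketch {
    // Three characters (U+65E5 U+672C U+8A9E) encoded as EUC-JP.
    static final byte[] EUC_JP_BYTES = {
        (byte) 0xC6, (byte) 0xFC, (byte) 0xCB, (byte) 0xDC, (byte) 0xB8, (byte) 0xEC
    };

    // Returns the decoded text, or null if the bytes are rejected by the charset.
    static String strictDecode(byte[] bytes, Charset charset) {
        try {
            return charset.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString();
        } catch (CharacterCodingException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UTF-8", "EUC-JP", "GBK"}) {
            if (!Charset.isSupported(name)) {
                System.out.println(name + ": not available in this runtime");
                continue;
            }
            String text = strictDecode(EUC_JP_BYTES, Charset.forName(name));
            System.out.println(name + ": " + (text == null ? "rejected" : "accepted"));
        }
    }
}
```

      A charset that accepts the bytes does not necessarily decode them to the intended characters; it only means that a strict decoder has nothing to report.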

      Steps to reproduce:
      1. Get HelloWorld.java from bugtraq (attached).
      2. Set the locale to any UTF-8 locale:
         setenv LC_ALL ja_JP.UTF-8
      3. Compile it with build b49 or later of 1.5.0-beta2.

      ###@###.### 2004-05-18

            Assignee: Unassigned
            Reporter: Shuna Wu (shuwu) (Inactive)