- Bug
- Resolution: Fixed
- P3
- 1.1.3, 1.1.6, 1.2.0
- 1.2beta2
- generic, sparc
- solaris_2.5.1
- Verified
Name: laC46010 Date: 10/17/97
The Java language permits Unicode escapes to represent
various source characters, including line terminators.
This means, for example, that \u000D may be used to terminate
a single-line comment:
// text text \u000D int i=1;
However, the Java compiler does not recognize \u000D (CR represented as a
Unicode escape) as a line terminator.
The JLS specifies that Unicode escapes are processed before anything else
(3.2 Lexical Translations, p.12):
A raw Unicode character stream is translated into a sequence of
Java tokens, using the following three lexical translation
steps, which are applied in turn:
1. A translation of Unicode escapes in the raw stream of
Unicode characters to the corresponding Unicode character. A
Unicode escape of the form \uxxxx, where xxxx is a hexadecimal
value, represents the Unicode character whose encoding is
xxxx. This translation step allows any Java program to be
expressed using only ASCII characters.
2. A translation of the Unicode stream resulting from step 1
into a stream of input characters and line terminators.
3. A translation of the stream of input characters and line
terminators resulting from step 2 into a sequence of Java
input elements which, after white space and comments are
discarded, comprise the tokens that are the terminal symbols
of the syntactic grammar.
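The first two translation steps quoted above can be sketched as follows. This is a simplified illustration, not the javac implementation: the class name `UnicodeEscapeDemo` and the methods `translateEscapes` and `splitLines` are hypothetical, and corner cases of escape handling are deliberately ignored. It shows that once escapes are translated first, the comment example from this report ends at the escaped CR.

```java
import java.util.ArrayList;
import java.util.List;

public class UnicodeEscapeDemo {

    // Step 1 (simplified): replace each backslash-u escape with its character.
    // A backslash can start an escape only when it is preceded by an even
    // number of contiguous backslashes.
    public static String translateEscapes(String raw) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        int backslashes = 0; // contiguous backslashes immediately before position i
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c == '\\' && backslashes % 2 == 0) {
                int j = i + 1;
                while (j < raw.length() && raw.charAt(j) == 'u') {
                    j++; // one or more 'u' characters are allowed
                }
                if (j > i + 1 && j + 4 <= raw.length() && isHex(raw, j)) {
                    out.append((char) Integer.parseInt(raw.substring(j, j + 4), 16));
                    i = j + 4;
                    backslashes = 0;
                    continue;
                }
            }
            backslashes = (c == '\\') ? backslashes + 1 : 0;
            out.append(c);
            i++;
        }
        return out.toString();
    }

    // True if the four characters starting at 'from' are ASCII hex digits.
    private static boolean isHex(String s, int from) {
        for (int k = from; k < from + 4; k++) {
            char ch = s.charAt(k);
            boolean hex = (ch >= '0' && ch <= '9')
                    || (ch >= 'a' && ch <= 'f')
                    || (ch >= 'A' && ch <= 'F');
            if (!hex) {
                return false;
            }
        }
        return true;
    }

    // Step 2 (simplified): split the translated stream at CR, LF, or CR LF.
    public static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\r' || c == '\n') {
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') {
                    i++; // treat CR LF as a single line terminator
                }
                lines.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        lines.add(current.toString());
        return lines;
    }

    public static void main(String[] args) {
        // The doubled backslash keeps the escape as six literal characters
        // in the string, so this program itself compiles normally.
        String raw = "// text text \\u000D int i=1;";
        for (String line : splitLines(translateEscapes(raw))) {
            System.out.println("[" + line + "]");
        }
    }
}
```

Running `main` prints `[// text text ]` and `[ int i=1;]` on separate lines: under the ordering the JLS describes, the declaration after the escaped CR is on its own line and is no longer part of the comment.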
Note that a similar bug (bug ID 4063147), concerning line terminators
represented as Unicode escapes within string literals, was fixed in jdk1.2beta1.
The following JCK12beta1 tests fail because of this bug:
lang/LEX/lex005/lex00591/lex00591.html
lang/LEX/lex054/lex05402/lex05402.html
lang/LEX/lex054/lex05403/lex05403.html
lang/LEX/lex058/lex05891/lex05891.html
See the "lex00503" source and results below:
> /export/ld14/java/dest/jdk1.2P/solaris/bin/java -fullversion
java full version "JDK1.2P"
> /export/ld14/java/dest/jdk1.2P/solaris/bin/javac -d . lex00503.java
lex00503.java:17: Invalid character in input.
int a; \u000D
^
1 error
----------------------lex00503.java----------------------
// Ident: %Z%%M% %I% %E%
// Copyright %G% Sun Microsystems, Inc. All Rights Reserved
// Auto-generated with Jmpp
// Template Ident: @(#)lex00591.jmpp 1.1 97/10/10
package javasoft.sqe.tests.lang.lex005.lex00503;
import java.io.PrintStream;
public class lex00503 {
    public static void main(String argv[]) {
        System.exit(run(argv, System.out) + 95/*STATUS_TEMP*/);
    }
    public static int run(String argv[],PrintStream out) {
        /*--- Line terminator `carriage return` as Unicode-escape. ---*/
        int a; \u000D
        return 0/*STATUS_PASSED*/;
    }
}
-----------------------------------------------------------
======================================================================
- duplicates
  - JDK-4093090 javac erroneously compiles // style comment containing '\u000A' (Closed)
- relates to
  - JDK-4112770 1.1 compiler can't accept unicode-escape \u000D (Closed)