Resolution: Fixed
1.4.0, 1.4.2
Name: gm110360 Date: 12/14/2001
java version "1.4.0-beta3"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta3-b84)
Java HotSpot(TM) Client VM (build 1.4.0-beta3-b84, mixed mode)
> http://java.sun.com/products/jdk/1.2/compatibility.html
> Runtime Incompatibilities in Version 1.2
> In JDK 1.2 software
> the -Xfuture option enables the strictest possible
> class-file format checks ...
If only this were true. Please try the demo below, see that something's
broken, tell me what it is, and fix it.
> reject... illegal UTF-8 strings
I hope I'm right to think you agree that vmspec UTF-8, by "4.4.7 The
CONSTANT_Utf8_info Structure", is only shortest form UTF-8 except that u0000,
if present, appears always as x C0 80?
By that definition, the `java -Xfuture` verification of .class file format
rejects a lot less than all forms of "illegal UTF-8 strings".
1) The verification never complains of not-shortest-form UTF-8. (Though it
does complain of the too-short-form x 00.)
2) The verification accepts truncated and ill-formed UTF-8 in string values,
attribute names, and unused entries.
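
For comparison, here is a minimal sketch of a check that would reject all three kinds of bad input named above (truncated, ill-formed, not-shortest-form). It assumes the reading of 4.4.7 given in this report: one byte for u0001..u007F, two bytes for u0000 and u0080..u07FF, three bytes for u0800..uFFFF. The class and method names are hypothetical and this is not the VM's actual verification code.

```java
/** Sketch of a strict checker for class-file (vmspec) UTF-8; names are hypothetical. */
final class StrictUtf8Check {

    /** Returns true only if every char appears in its single permitted encoding. */
    static boolean isStrict(byte[] b) {
        int i = 0;
        while (i < b.length) {
            int b0 = b[i] & 0xFF;
            if (b0 >= 0x01 && b0 <= 0x7F) {
                i += 1;                                   // one byte: u0001..u007F
            } else if ((b0 & 0xE0) == 0xC0) {             // two bytes: u0000, u0080..u07FF
                if (i + 1 >= b.length) return false;      // truncated
                int b1 = b[i + 1] & 0xFF;
                if ((b1 & 0xC0) != 0x80) return false;    // ill-formed continuation byte
                int ch = ((b0 & 0x1F) << 6) | (b1 & 0x3F);
                if (ch != 0 && ch < 0x80) return false;   // not shortest form
                i += 2;
            } else if ((b0 & 0xF0) == 0xE0) {             // three bytes: u0800..uFFFF
                if (i + 2 >= b.length) return false;      // truncated
                int b1 = b[i + 1] & 0xFF;
                int b2 = b[i + 2] & 0xFF;
                if ((b1 & 0xC0) != 0x80 || (b2 & 0xC0) != 0x80) return false;
                int ch = ((b0 & 0x0F) << 12) | ((b1 & 0x3F) << 6) | (b2 & 0x3F);
                if (ch < 0x800) return false;             // not shortest form
                i += 3;
            } else {
                return false;   // x00, a stray continuation byte, or xF0..xFF
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] overlong = {(byte) 0xE0, (byte) 0x90, (byte) 0x81};
        byte[] shortest = {(byte) 0xD0, (byte) 0x81};
        System.out.println(isStrict(overlong) + " " + isStrict(shortest));
    }
}
```

Under this sketch x E0 90 81 and x D0 81 both decode to u0401, but only the two-byte form passes.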
We care because by design, vmspec UTF-8 defines precisely zero or one ways to
represent any sequence of chars. By defining more than one sequence of bytes
as equal to a given sequence of chars, we raise unanswerable questions. Does
one method override another? Is a field present? Is a constant initialiser
Now for the promised quick, rough demo of some of this. Try editing the binary
A.class after compiling this source:
class A
{
    final static int theInt = 0x9ABCDEF0;
    String theString = "ConstantValue";
}
class B
{
    public static void main(String[] strings)
    {
        A a = new A();
        String st = a.theString;
        for (int index = 0; index < st.length(); ++index)
        {
            char ch = st.charAt(index);
            // The call was cut off after "Integer." in this report; toHexString is assumed.
            System.out.println("x" + Integer.toHexString(ch));
        }
    }
}
In the binary A.class, confirm you see only one CONSTANT_Utf8_info entry that
equals "theInt":
01 00:06 74 68 65 49 6E 74 // theInt
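
If it helps when searching the binary, the expected bytes of such an entry can be produced rather than worked out by hand. The sketch below relies on `DataOutputStream.writeUTF`, which writes a two-byte length followed by the same modified UTF-8 the class file uses. The tag byte 01 is prepended by hand. The class and method names are hypothetical, and the output separates every byte with a space instead of writing the length as 00:06.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Sketch: build the bytes of a CONSTANT_Utf8_info entry for a given string. */
final class Utf8Entry {

    /** Returns tag (01), u2 length, then the modified UTF-8 bytes of s. */
    static byte[] encode(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeByte(1);   // tag of CONSTANT_Utf8
            out.writeUTF(s);    // u2 length, then the bytes
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
    }

    /** Formats bytes as upper-case hex pairs separated by single spaces. */
    static String hex(byte[] b) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < b.length; i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", b[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hex(encode("theInt")));
        System.out.println(hex(encode("ConstantValue")));
    }
}
```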
See also that `java -Xfuture B` accepts the A.class binary.
Now change the A.class binary. Change the trailing x74 to an xE0. See that
`java -Xfuture B` explodes, complaining of an "Illegal Field name". So far so
good.
Now restore the original A.class binary (most simply, recompile it). Go find
the one entry of:
01 00:0D 43 6F 6E 73 74 61 6E 74 56 61 6C 75 65 // ConstantValue
Change the trailing x65 to an xE0. See that `java -Xfuture B` is happy.
Conclude that string values and attribute names may contain truncated Utf.
Repeat, if you like, changing two trailing bytes, to see constant pool Utf may
contain ill-formed Utf, such as x D0 01 (b10xx:xxxx does not follow b110x:xxxx).
Repeat, if you like, changing three trailing bytes, to see constant pool Utf
may contain not-shortest-form Utf, such as x E0 90 81. So may field names, etc.
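
For anyone without a hex editor to hand, the edits above can be scripted. This is only a rough sketch: it searches the whole file for the raw entry bytes and overwrites the trailing bytes in place, so it assumes the pattern occurs exactly once in the file (as confirmed by eye above) and does not parse the constant pool. The class name, method names and default file name are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sketch: overwrite the trailing bytes of one CONSTANT_Utf8_info entry. */
final class PatchUtf8Tail {

    /** Index of the only occurrence of needle in hay, or -1 if absent or repeated. */
    static int findUnique(byte[] hay, byte[] needle) {
        int found = -1;
        for (int i = 0; i + needle.length <= hay.length; i++) {
            boolean match = true;
            for (int j = 0; j < needle.length; j++) {
                if (hay[i + j] != needle[j]) { match = false; break; }
            }
            if (match) {
                if (found >= 0) return -1;   // more than one match
                found = i;
            }
        }
        return found;
    }

    /** Returns a copy of classBytes with the last tail.length bytes of entry replaced. */
    static byte[] patchTail(byte[] classBytes, byte[] entry, byte[] tail) {
        int at = findUnique(classBytes, entry);
        if (at < 0) throw new IllegalArgumentException("entry not found exactly once");
        byte[] out = classBytes.clone();
        System.arraycopy(tail, 0, out, at + entry.length - tail.length, tail.length);
        return out;
    }

    public static void main(String[] args) throws IOException {
        Path file = Paths.get(args.length > 0 ? args[0] : "A.class");
        // 01 00:0D "ConstantValue", as listed in this report
        byte[] entry = {0x01, 0x00, 0x0D, 0x43, 0x6F, 0x6E, 0x73, 0x74,
                        0x61, 0x6E, 0x74, 0x56, 0x61, 0x6C, 0x75, 0x65};
        byte[] tail = {(byte) 0xE0};         // truncated three-byte lead
        Files.write(file, patchTail(Files.readAllBytes(file), entry, tail));
    }
}
```

Swapping `tail` for x D0 01 or x E0 90 81 gives the two-byte and three-byte variants.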
Please tell me what's broken and fix it - or unconfuse me!
Thanks in advance. Pat LaVarre
> http://developer.java.sun.com/developer/bugParade/
> +-Xfuture +utf
> 4 Results Found, Sorted by [lack of] Relevance
(Review ID: 136117)
- duplicates
JDK-4787534 jdk1.4.2 accepts invalid class files
- Closed