1. (4402549) The jdk1.4.0beta-b45 API specification for the method Character.isMirrored(ch) doesn't specify the return value for undefined chars.
The Unicode 3.0 standard doesn't specify a mirrored property for undefined chars.
The jdk1.4.0beta-b45 API implementation returns false for all undefined chars.
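The observation in item 1 can be checked with a short probe. This is a minimal sketch, not part of the original report; the class name MirroredProbe and the helper isUndefinedAndMirrored are hypothetical names chosen for illustration. It counts the undefined chars for which isMirrored returns true, so the behavior described above corresponds to a count of zero.

```java
class MirroredProbe {
    // Hypothetical helper: true when ch is undefined yet reported as mirrored.
    static boolean isUndefinedAndMirrored(char ch) {
        return !Character.isDefined(ch) && Character.isMirrored(ch);
    }

    public static void main(String[] args) {
        int count = 0;
        // Walk the whole 16-bit char range.
        for (int i = 0; i <= 0xFFFF; i++) {
            if (isUndefinedAndMirrored((char) i)) {
                count++;
            }
        }
        System.out.println("undefined chars reported as mirrored: " + count);
    }
}
```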
2. (4402548) The jdk1.4.0beta-b45 API specification for the method Character.getDirectionality(ch) doesn't specify the return value for undefined chars.
According to the Unicode standard, the directional type for all unassigned
code values is not defined, but the jdk1.4.0beta-b45 API implementation returns
DIRECTIONALITY_LEFT_TO_RIGHT (i.e. 0) for all undefined chars.
The following simple test shows this:
public class test {
    public static void main(String[] args) {
        // Collect the directionality reported for every undefined char.
        StringBuilder str = new StringBuilder();
        for (int i = 0; i <= 65535; ++i) {
            if (!Character.isDefined((char) i)) {
                str.append(' ').append(Character.getDirectionality((char) i));
            }
        }
        System.out.println(str);
    }
}
3. (4402127) The Character.getNumericValue(ch) method returns incorrect values for the following chars:
0x41 0x42 0x43 0x44 0x45 0x46 0x47 0x48 0x49 0x4a 0x4b 0x4c 0x4d 0x4e 0x4f 0x50
0x51 0x52 0x53 0x54 0x55 0x56 0x57 0x58 0x59 0x5a 0x61 0x62 0x63 0x64 0x65 0x66
0x67 0x68 0x69 0x6a 0x6b 0x6c 0x6d 0x6e 0x6f 0x70 0x71 0x72 0x73 0x74 0x75 0x76
0x77 0x78 0x79 0x7a 0xff21 0xff22 0xff23 0xff24 0xff25 0xff26 0xff27 0xff28
0xff29 0xff2a 0xff2b 0xff2c 0xff2d 0xff2e 0xff2f 0xff30 0xff31 0xff32 0xff33
0xff34 0xff35 0xff36 0xff37 0xff38 0xff39 0xff3a 0xff41 0xff42 0xff43 0xff44
0xff45 0xff46 0xff47 0xff48 0xff49 0xff4a 0xff4b 0xff4c 0xff4d 0xff4e 0xff4f
0xff50 0xff51 0xff52 0xff53 0xff54 0xff55 0xff56 0xff57 0xff58 0xff59 0xff5a
jdk1.4.0beta-b45 specification reads:
" public static int getNumericValue(char ch)
Returns the int value that the specified Unicode character represents.
For example, the character '\u216C' (the roman numeral fifty) will return
an int with a value of 50.
If the character does not have a numeric value, then -1 is returned.
If the character has a numeric value that cannot be represented as a nonnegative
integer (for example, a fractional value), then -2 is returned."
According to this, Character.getNumericValue(ch) should return -1 for the chars
listed above, since Unicode 3.0 defines no numeric values for them.
However, the jdk1.4.0beta-b45 API implementation returns the following corresponding
decimal values:
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 10
11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 10 11
12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 10 11 12
13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
Because of this, the new JCK Merlin test api/java_lang/Character/index.html#charFullRange[Character2084] fails.
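The values in item 3 can be reproduced with a small probe. This is a sketch only; the class name NumericValueProbe and the helper valuesFor are hypothetical and do not come from the report or from the JCK test. It prints what getNumericValue returns for the four letter ranges listed above (ASCII and fullwidth, upper and lower case).

```java
import java.util.Arrays;

class NumericValueProbe {
    // Hypothetical helper: getNumericValue results for the inclusive range first..last.
    static int[] valuesFor(char first, char last) {
        int[] out = new int[last - first + 1];
        for (int i = 0; i < out.length; i++) {
            out[i] = Character.getNumericValue((char) (first + i));
        }
        return out;
    }

    public static void main(String[] args) {
        char[][] ranges = {
            {'A', 'Z'}, {'a', 'z'},            // ASCII letters
            {'\uFF21', '\uFF3A'},              // fullwidth capital letters
            {'\uFF41', '\uFF5A'}               // fullwidth small letters
        };
        for (char[] r : ranges) {
            System.out.println(Arrays.toString(valuesFor(r[0], r[1])));
        }
    }
}
```

A return value of -1 for every entry would match the reading of the specification quoted above; values from 10 to 35 match the behavior the report describes.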
4. (4401684) The jdk1.4.0beta-b45 specification for method Character.isUnicodeIdentifierStart(ch) reads:
"public static boolean isUnicodeIdentifierStart(char ch)
Determines if the specified character is permissible as the first character
in a Unicode identifier. A character may start a Unicode identifier if and
only if it is a letter..."
However, the jdk1.4beta-b45, jdk1.3, and jdk1.2.2 API implementations treat not only
letters as Unicode identifier starts, but also characters whose Unicode general category
is "Nl", which are not letters (according to the Character.isLetter(ch) method specification).
The same applies to the Character.isJavaIdentifierStart(ch) method.
The following simple test shows this:
public class test {
    public static void main(String[] args) {
        // Print every char for which isLetter and isUnicodeIdentifierStart disagree.
        for (int i = 0; i <= 65535; ++i) {
            if (Character.isLetter((char) i) != Character.isUnicodeIdentifierStart((char) i)) {
                System.out.print("0x" + Integer.toHexString(i) + " ");
            }
        }
    }
}
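For a single concrete case of item 4, the sketch below looks at U+2160 (ROMAN NUMERAL ONE), a char of general category "Nl". The class name IdentifierStartProbe and the helper isNonLetterStart are hypothetical, introduced only for this illustration.

```java
class IdentifierStartProbe {
    // Hypothetical helper: true when ch is accepted as a Unicode identifier
    // start although Character.isLetter rejects it.
    static boolean isNonLetterStart(char ch) {
        return Character.isUnicodeIdentifierStart(ch) && !Character.isLetter(ch);
    }

    public static void main(String[] args) {
        char romanOne = '\u2160'; // ROMAN NUMERAL ONE, general category Nl
        System.out.println("type is LETTER_NUMBER: "
                + (Character.getType(romanOne) == Character.LETTER_NUMBER));
        System.out.println("non-letter identifier start: " + isNonLetterStart(romanOne));
        System.out.println("Java identifier start: "
                + Character.isJavaIdentifierStart(romanOne));
    }
}
```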
5. (4401683) The jdk1.4.0beta-b45 API specification for the method Character.toTitleCase(ch)
is inaccurate. It states:
"public static char toTitleCase(char ch)
Converts the character argument to titlecase using case mapping information from
the UnicodeData file. If a character has no explicit titlecase mapping according to
UnicodeData, then the uppercase mapping is returned as an equivalent titlecase mapping."
This algorithm is incorrect for chars of Unicode category "Lt" that are
titlecase chars themselves but also have uppercase mappings that differ
from their own code points.
For example, Unicode 3.0.0 defines the following for these chars:
CODEPOINT  UPPER_CASE  LOWER_CASE  TITLE_CASE  CATEGORY
0x01C5     0x01C4      0x01C6      no          "Lt"
0x01C8     0x01C7      0x01C9      no          "Lt"
0x01CB     0x01CA      0x01CC      no          "Lt"
0x01F2     0x01F1      0x01F3      no          "Lt"
Following the specified algorithm, Character.toTitleCase((char)0x01C5) should return
0x01C4, but in fact jdk1.4.0beta-b45 Character.toTitleCase((char)0x01C5) returns
the correct value, 0x01C5.
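The difference between the uppercase and titlecase mappings in item 5 can be printed side by side. This is a minimal sketch; the class name TitleCaseProbe and the helper hex are hypothetical and exist only for this illustration.

```java
class TitleCaseProbe {
    // Hypothetical helper: formats a char as a 0x-prefixed, four-digit hex code point.
    static String hex(char ch) {
        return String.format("0x%04X", (int) ch);
    }

    public static void main(String[] args) {
        // The four "Lt" chars discussed in the report.
        char[] titlecaseChars = { '\u01C5', '\u01C8', '\u01CB', '\u01F2' };
        for (char ch : titlecaseChars) {
            System.out.println(hex(ch)
                    + " upper=" + hex(Character.toUpperCase(ch))
                    + " title=" + hex(Character.toTitleCase(ch)));
        }
    }
}
```

If the implementation followed the quoted specification literally, the upper and title columns would be identical; the report states that the implementation instead returns the char itself as its titlecase form.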
6. (4395328) The Character API doc (isWhitespace() method description) in build 1.4.0beta-b43 contains two errors in the following phrase:
", but is not a no-break space (\u00A0 or \uFEFF)"
. \uFEFF is not of type SPACE_SEPARATOR, LINE_SEPARATOR, or PARAGRAPH_SEPARATOR.
. (\u00A0 or \uFEFF) should be (\u00A0, \u202F, or \u2007)
Please see 4395323 for 202f and 2007.
New non-breaking space (and other non-breaking) chars have been added to the Unicode 3.0 spec. The following separators should be excluded from the set; isWhitespace() should return false for them:
00a0
2007
202f
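The two points in item 6 can be checked per char: whether it is a separator by general category, and what isWhitespace reports for it. This is a sketch only; the class name WhitespaceProbe and the helper isSeparator are hypothetical.

```java
class WhitespaceProbe {
    // Hypothetical helper: true when ch has a separator general category (Zs, Zl, or Zp).
    static boolean isSeparator(char ch) {
        int type = Character.getType(ch);
        return type == Character.SPACE_SEPARATOR
                || type == Character.LINE_SEPARATOR
                || type == Character.PARAGRAPH_SEPARATOR;
    }

    public static void main(String[] args) {
        // The three non-breaking spaces plus U+FEFF from the current doc text.
        char[] chars = { '\u00A0', '\u2007', '\u202F', '\uFEFF' };
        for (char ch : chars) {
            System.out.println(String.format("%04x", (int) ch)
                    + " separator=" + isSeparator(ch)
                    + " whitespace=" + Character.isWhitespace(ch));
        }
    }
}
```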
All of these bug IDs indicate that the Character javadoc needs updating in the areas they describe. In most cases, the spec needs to document intended but undocumented behavior.
- duplicates
  JDK-4395328 wrong api doc information in Character.isWhitespace method (Closed)
  JDK-4401683 Inaccurate specification for Character.toTitleCase(ch) (Closed)
  JDK-4401684 Incomplete specification for Character.isUnicodeIdentifierStart(ch) (Closed)
  JDK-4402127 Character.getNumericValue(ch) returns incorrect values (Closed)
  JDK-4402548 Character.getDirectionality(ch) returns wrong values (Closed)
  JDK-4402549 incomplete specification for Character.isMirrored(ch) (Closed)
- relates to
  JDK-4427146 class Character: 3 methods return invalid values (Closed)