-
Bug
-
Resolution: Won't Fix
-
P5
-
None
-
5.0
-
x86
-
linux
A DESCRIPTION OF THE PROBLEM :
The supported-encodings documentation lists "ISO 8859-1" as the supported encoding, but the encoding actually implemented is "ISO-8859-1". The names differ only in the space versus the dash after "ISO".
The difference between the encodings is that ISO-8859-1 contains additional control characters not found in ISO 8859-1. Specifically, isEncodable as ISO-8859-1 should return true for every Unicode value from 0 to 255, whereas for ISO 8859-1 it should return false for some values in that range. Those characters are:
"Code values 00-1F, 7F, and 80-9F are not assigned to characters by ISO/IEC 8859-1."
It would be correct to specify ISO-8859-1. It would also help developers to state explicitly that the encoding covers the entire range 0-255, control characters included, either with a phrase such as "including control characters from ISO 6429" or with a note such as "Note: this is different from ISO 8859-1 in that it includes control characters from ISO 6429 in the ranges 00-1F, 7F and 80-9F".
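The behavior described above can be checked with a short sketch. It assumes that the "isEncodable" test in this report corresponds to `CharsetEncoder.canEncode(char)` in java.nio.charset; the class name `Latin1ControlCheck` and its helper methods are made up for illustration. If the report is right, every value from 0 to 255 is reported as encodable, and control codes such as 0x85 survive a round trip through a single byte.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical helper class, not part of the JDK.
final class Latin1ControlCheck {
    // NIO name as listed in the supported-encodings table.
    static final Charset LATIN1 = Charset.forName("ISO-8859-1");

    /** Counts the values in 0..255 that the encoder reports as encodable. */
    static int countEncodable() {
        CharsetEncoder encoder = LATIN1.newEncoder();
        int count = 0;
        for (int cp = 0; cp <= 0xFF; cp++) {
            if (encoder.canEncode((char) cp)) {
                count++;
            }
        }
        return count;
    }

    /** True if c encodes to one byte with the same value and decodes back to c. */
    static boolean roundTrips(char c) {
        byte[] bytes = String.valueOf(c).getBytes(LATIN1);
        return bytes.length == 1
                && (bytes[0] & 0xFF) == c
                && new String(bytes, LATIN1).charAt(0) == c;
    }

    public static void main(String[] args) {
        System.out.println("encodable in 0..255: " + countEncodable());
        // 0x1F, 0x7F and 0x85 lie in the ranges ISO/IEC 8859-1 leaves unassigned.
        System.out.println("0x1F round-trips: " + roundTrips((char) 0x1F));
        System.out.println("0x7F round-trips: " + roundTrips((char) 0x7F));
        System.out.println("0x85 round-trips: " + roundTrips((char) 0x85));
    }
}
```

A count of 256 would confirm that the implemented charset assigns the control ranges 00-1F, 7F and 80-9F, which is the point the requested documentation note is meant to convey.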
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
(The formatting is in a table, so I'm just reproducing a portion)
NIO Name, IO Name, Encoding Name:
ISO-8859-1 ISO8859_1 ISO-8859-1, Latin Alphabet No. 1. Note that this is different from ISO 8859-1 in that control characters in the range 00-1F, 7F and 80-9F are included.
ACTUAL -
NIO Name, IO Name, Encoding Name:
ISO-8859-1 ISO8859_1 ISO 8859-1, Latin Alphabet No. 1
URL OF FAULTY DOCUMENTATION :
http://java.sun.com/j2se/1.5.0/docs/guide/intl/encoding.doc.html