-
Enhancement
-
Resolution: Won't Fix
-
P4
-
None
-
6u10
-
None
-
sparc
-
solaris_7
XML parsers are only required [1] to recognize and process UTF-8 and UTF-16. In practice, XML parser implementations recognize more encodings than that, but the XML specification recommends that encodings be referred to by their registered IANA names. The IANA registry entry for this encoding is:
Name: ISO_8859-1:1987 [RFC1345,KXS2]
MIBenum: 4
Source: ECMA registry
Alias: iso-ir-100
Alias: ISO_8859-1
Alias: ISO-8859-1 (preferred MIME name)
Alias: latin1
Alias: l1
Alias: IBM819
Alias: CP819
Alias: csISOLatin1
There is a feature called "http://apache.org/xml/features/allow-java-encodings" which causes the parser to also recognize some Java encoding names [3], but this is limited to canonical names like "ISO8859_1". The way the parser is being used here isn't portable: it won't work with all parser implementations and might not work across platforms. The XML document itself should declare its encoding as ISO-8859-1, and/or the parser should be given that same string via setEncoding(). Alternatively, the application could provide a java.io.Reader (e.g. java.io.InputStreamReader) as input if it knows the encoding.
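The two portable options described above can be sketched roughly as follows. This is only an illustration, not the code from the report: it assumes the JAXP DocumentBuilder API and takes setEncoding() to mean org.xml.sax.InputSource.setEncoding(); the class name Latin1XmlInput and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
class Latin1XmlInput {

    // Option 1: hand the parser a byte stream and name the encoding
    // with its registered IANA name rather than a Java-specific one.
    static Document parseWithEncodingName(InputStream in) throws Exception {
        InputSource source = new InputSource(in);
        source.setEncoding("ISO-8859-1");
        return newBuilder().parse(source);
    }

    // Option 2: decode the bytes in the application and hand the parser
    // a character stream, so the parser never has to resolve a name.
    static Document parseWithReader(InputStream in) throws Exception {
        Reader reader = new InputStreamReader(in, StandardCharsets.ISO_8859_1);
        return newBuilder().parse(new InputSource(reader));
    }

    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    public static void main(String[] args) throws Exception {
        // Sample document that also declares its encoding with the IANA name.
        byte[] xml = ("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
                + "<name>caf\u00e9</name>").getBytes(StandardCharsets.ISO_8859_1);

        Document a = parseWithEncodingName(new ByteArrayInputStream(xml));
        Document b = parseWithReader(new ByteArrayInputStream(xml));

        // Both approaches should yield the same element text.
        System.out.println(a.getDocumentElement().getTextContent()
                .equals(b.getDocumentElement().getTextContent()));
    }
}
```

The Reader variant is the more defensive of the two, since the parser receives already-decoded characters and its own table of encoding names is not consulted.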
It is expected that the property returns a standard IANA-compatible string like "ISO-8859-1". JS was using "8859_1" earlier and started using "ISO8859_1" (neither of which is registered with IANA). It seems sensible to have the default file.encoding set to the standard registered name for the encoding, i.e. "ISO-8859-1", rather than one of its aliases.
We wanted to know the reason behind JS supporting "ISO8859_1" and not the standard "ISO-8859-1".
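As a possible application-side workaround, a Java-specific encoding name can be normalized before it is passed to an XML parser. The sketch below relies on java.nio.charset.Charset, which accepts the historical Java names as aliases and reports a canonical name; for this encoding the canonical name coincides with the IANA preferred MIME name, though that is not guaranteed for every charset. The class and method names are hypothetical.

```java
import java.nio.charset.Charset;

// Hypothetical helper class, for illustration only.
class EncodingNames {

    // Resolve any alias the JVM knows and return the canonical java.nio name.
    // Throws an unchecked exception if the name is illegal or unsupported.
    static String canonicalName(String name) {
        return Charset.forName(name).name();
    }

    public static void main(String[] args) {
        // Historical Java names and one IANA alias for the same charset.
        for (String legacy : new String[] {"8859_1", "ISO8859_1", "latin1"}) {
            System.out.println(legacy + " -> " + canonicalName(legacy));
        }
    }
}
```

An application could apply the same call to the value of file.encoding before using it as an XML encoding name, instead of depending on the parser to recognize Java names.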