-
Bug
-
Resolution: Duplicate
-
P4
-
None
-
1.4.0
-
generic
-
other
When running the W3C validator (http://validator.w3.org/)
on 1.4.x, the following warning is reported:
Warning: No Character Encoding detected! To assure correct validation,
processing, and display, it is important that the character encoding is
properly labeled.
The document character set for XML and HTML 4.0 is Unicode (aka ISO 10646).
This means that HTML browsers and XML processors should behave as if they
used Unicode internally. But it doesn't mean that documents have to
be transmitted in Unicode. As long as client and server agree on the
encoding, they can use any encoding that can be converted to Unicode.
It is very important that the character encoding of any XML or (X)HTML
document is clearly labeled. This can be done in the following ways:
- For HTML, use the <meta> tag. Example:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
With this information, clients can easily map these encodings to Unicode.
In practice, a few encodings will be preferred, most likely: ISO-8859-1
(Latin-1), US-ASCII, UTF-8, UTF-16, the other encodings in the
ISO-8859 series, iso-2022-jp, euc-kr, and so on.
Source: http://www.w3.org/International/O-charset.html
However, the I18N team has previously warned against using charsets in meta
tags, so this discrepancy needs to be resolved.
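
To make the validator's complaint concrete, here is a minimal sketch in Java of a check for the <meta> charset label quoted above. The class name MetaCharsetCheck and the method findMetaCharset are hypothetical and are not part of the standard doclet or any JDK API. The regular expression is a deliberate simplification, not a real HTML parser, so it may misread unusual markup.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: looks for a charset declared in a <meta> tag.
class MetaCharsetCheck {

    // Matches charset=... inside a <meta ...> tag, case-insensitively.
    // Covers both the http-equiv form quoted in this report and <meta charset="...">.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\b[^>]*\\bcharset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared charset name, or empty if no <meta> label is found.
    static Optional<String> findMetaCharset(String html) {
        Matcher m = META_CHARSET.matcher(html);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String labeled =
                "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">";
        String unlabeled = "<html><head><title>t</title></head></html>";

        System.out.println(findMetaCharset(labeled));   // charset is present
        System.out.println(findMetaCharset(unlabeled)); // no label, as in the warning
    }
}
```

A page for which findMetaCharset returns an empty result has no <meta> label. The validator could still find an encoding in the HTTP Content-Type header, which this sketch does not inspect.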
- duplicates
JDK-4756688 stddoclet: Combine -docencoding and -charset options
- Closed