Type: Bug
Resolution: Fixed
Priority: P4
Fix Version/s: 1.4.2
Affects Version/s: 1.4.2
Component/s: docs
Labels: webwsdoc
Subcomponent: guides
Resolved In Build: rc
CPU: generic
OS: other

Why do we bother removing hard-coded charset attributes from the <META>
tags in the J2SE guide docs, as reported below?

If the translators change the charset appropriately,
is leaving the charset explicit really a problem?

Here are the results of a charset check:

  Date: Sun, 6 Apr 2003 10:04
  To: ###@###.###
  Subject: charset META tags in J2SE 1.4.2 Docs

  Results of search for 1.4.2 .html files that contain <META...charset=...
  tags in:
  /usr/web/work/j2se/1.4.2/docs

  RATIONALE
  ---------
  META...charset=iso-8859-1 (American ANSI) docs cannot be viewed
  in Japanese browsers. META...charset=JA.. docs cannot be viewed in
  English browsers. Such docs specify a character set that only works
  in a single locale. And since the tags are buried in the HTML
  header, they are hard for translators to see.

  The files listed below have that tag.
  ---------------------------------------
  guide/2d/spec/j2d-awt.html
  guide/2d/spec/j2d-bookTOC.html
  guide/2d/spec/j2d-color.html
  guide/jdbc/getstart/table8.7.html
  guide/jni/spec/acknowledge.html
  guide/jws/relnotes.html
  guide/net/relnotes.html
  guide/net/ja/relnotes.html
  guide/security/jgss/jgss-features.html
  guide/serialization/spec/version.html
  guide/versioning/spec/versioningTOC.html
  install-notes/disk-space.html
  install-notes/SCCS-ORIG/s.disk-space.html
  relnotes/license.html
  ------------------------------------------
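A check like the one in the email above could be reproduced with a short script. The following is only a sketch, not the tool that produced the list: the function name `find_charset_tags`, the regular expression, and the 4 KB read limit are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Hypothetical pattern: matches <META ... charset=NAME ...> in any letter case
# and captures NAME. It works on bytes so the file's own encoding is irrelevant.
META_CHARSET = re.compile(
    rb"<meta\b[^>]*\bcharset\s*=\s*[\"']?([A-Za-z0-9._:-]+)",
    re.IGNORECASE,
)


def find_charset_tags(root):
    """Return {relative path: declared charset} for .html files under root."""
    root = Path(root)
    found = {}
    for path in sorted(root.rglob("*.html")):
        # The tag lives in the HTML header, so the first 4 KB is assumed to suffice.
        head = path.read_bytes()[:4096]
        match = META_CHARSET.search(head)
        if match:
            relative = path.relative_to(root).as_posix()
            found[relative] = match.group(1).decode("ascii").lower()
    return found
```

Calling `find_charset_tags("/usr/web/work/j2se/1.4.2/docs")` would return a mapping from each matching file to the charset it declares, which also makes it easy to see which declarations a translator would need to adjust.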

###@###.### wrote:
Once again:

A correct charset tag is best: it tells the browser how to interpret the page correctly and requires no user intervention.
No charset tag is OK: this lets the user guess at and select the character encoding.
A wrong charset tag is bad, because it causes the browser to display garbage and prevents the user from correcting the situation.
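
The difference between a correct and a wrong charset tag can be illustrated with a small sketch. It assumes that a browser simply trusts the declared charset when decoding the page bytes; the helper name `render` and the sample text are made up for the example.

```python
def render(page_bytes, declared_charset):
    """Decode page bytes the way a browser is assumed to when it trusts the tag."""
    return page_bytes.decode(declared_charset, errors="replace")


text = "日本語"                 # sample Japanese text ("Japanese language")
page = text.encode("euc_jp")   # the bytes actually stored in the file

correct = render(page, "euc_jp")          # tag matches the real encoding
wrong_latin = render(page, "iso-8859-1")  # tag names a different encoding
wrong_sjis = render(page, "shift_jis")    # a Japanese charset, but still the wrong one
```

Here `correct` equals the original text, while `wrong_latin` and `wrong_sjis` are garbage. Each byte of the EUC-JP page becomes a separate Latin-1 character in `wrong_latin`, so three kanji turn into six unrelated characters.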

The old rule, to not have a charset tag, was based on the assumption that the translators would not be able to adjust the tag when they translate the text. Last year we found out that they are able to do this. So, I think we should change the rule and require a correct charset tag.

A few other corrections:

- 8859 is not a charset, but the number of an ISO standard that defines a series of character encodings. Correct charset names are iso-8859-1, iso-8859-2, and so on through iso-8859-10, then iso-8859-13 and iso-8859-15.

- iso-8859-1 is not American ANSI. ANSI is the American National Standards Institute, which is a member of ISO and which, among other things, defined ASCII, the American Standard Code for Information Interchange. ISO is the International Organization for Standardization, which, among other things, defined the ISO 8859 series of character encodings, all of which are extensions of ASCII. iso-8859-1 is the charset name of the first character encoding in the ISO 8859 series.

- There is no charset JA. There are a number of Japanese character encodings, with charset names such as euc-jp and shift_jis.

- Japanese browsers can display pages encoded in iso-8859-1, and most English browsers nowadays can display pages encoded in Japanese encodings. What they cannot do is display pages that are encoded in a different character encoding than indicated by their charset tag.

Assignee: Douglas Kramer (Inactive)

Reporter: Douglas Kramer (Inactive)

Votes: 0

Watchers: 0

Created: 2003-04-11 20:57

Updated: 2003-04-12 15:29

Resolved: 2003-04-12 15:29

Imported: 15/Sep/12 8:49 AM

Indexed: 17/Jul/12 6:36 AM
