JDK / JDK-8049995

No standard charset available for widely used Java file format

    Details

    • Type: Enhancement
    • Status: Resolved
    • Priority: P4
    • Resolution: Duplicate
    • Affects Version/s: 7
    • Fix Version/s: None
    • Component/s: core-libs
    • Labels:

      Description

      A DESCRIPTION OF THE REQUEST :
      Properties files use the ISO-8859-1 encoding, with \u escapes for any other Unicode characters. There doesn't appear to be a standard charset that supports this combination, which makes it difficult to generate properties files directly with, for example, PrintWriter. However, the native2ascii tool presumably already contains code that performs this conversion.

      JUSTIFICATION :
      It's a bit odd that there isn't an obvious way to generate a valid properties file that contains non-Basic Latin characters.
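
      For comparison, and not part of the original request: when the data is already held in a java.util.Properties object, Properties.store(OutputStream, String) writes ISO-8859-1 output and escapes characters outside the printable ASCII range itself. The sketch below illustrates that route only; it does not help when the file is written line by line through a PrintWriter, which is the case this request is about. The class name StoreProperties and the helper storeToString are hypothetical names chosen for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class StoreProperties
{
  // Hypothetical helper: stores a single key/value pair and returns
  // the bytes that Properties.store produced, decoded as ISO-8859-1.
  static String storeToString(String key, String value) throws IOException
  {
    Properties props = new Properties();
    props.setProperty(key, value);

    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    // The OutputStream overload writes ISO-8859-1 and escapes
    // characters that fall outside the printable ASCII range.
    props.store(bytes, null);
    return new String(bytes.toByteArray(), StandardCharsets.ISO_8859_1);
  }

  public static void main(String[] args) throws IOException
  {
    // Prints a date comment line followed by the escaped entry.
    System.out.print(storeToString("quote", "\u2019"));
  }
}
```

      Note that the Properties.store(Writer, String) overload does not escape such characters, so the choice of overload matters.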

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      An additional entry in java.nio.charset.StandardCharsets for an ISO-8859-1 variant with \u escapes.
      ACTUAL -
      No standard charset available.

      ---------- BEGIN SOURCE ----------
      package charsets;

      import java.io.FileNotFoundException;
      import java.io.PrintWriter;
      import java.io.UnsupportedEncodingException;
      import java.nio.charset.StandardCharsets;

      public class OutputText
      {
        public static void main(String[] args)
          throws FileNotFoundException, UnsupportedEncodingException
        {
          // ISO-8859-1 cannot encode U+2019, so the encoder substitutes '?'.
          // try-with-resources closes the writer even if printing fails.
          try (PrintWriter writer =
                 new PrintWriter(args[0], StandardCharsets.ISO_8859_1.name()))
          {
            writer.print("\u2019");
          }
        }
      }

      Run this with a filename such as /tmp/TestOutput.txt as the argument. Inspect the file and you'll see a single ? character rather than U+2019, the right single quotation mark. The same happens if the charset is US_ASCII.
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      You need to provide your own routine that converts characters the target charset cannot encode to \u escapes. I did this by processing each character individually, although I ignored characters in the supplementary planes.
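
      The reporter's routine is not included in the issue. A minimal sketch of such a workaround, under stated assumptions, might look like the following. The class name EscapedOutput and the helper escapeNonAscii are hypothetical. The sketch escapes every UTF-16 code unit above 0x7E, which is stricter than ISO-8859-1 requires but keeps the output pure ASCII. Because it works on code units, a supplementary character comes out as two escapes (its surrogate pair), which is the form a properties file expects.

```java
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;

public class EscapedOutput
{
  // Hypothetical helper: replaces each UTF-16 code unit above 0x7E
  // with a backslash-u escape of four uppercase hex digits.
  // Control characters and properties-file metacharacters such as
  // '=', ':' and the backslash itself are NOT handled here.
  static String escapeNonAscii(String text)
  {
    StringBuilder out = new StringBuilder(text.length());
    for (int i = 0; i < text.length(); i++)
    {
      char c = text.charAt(i);
      if (c > 0x7E)
        out.append(String.format("\\u%04X", (int) c));
      else
        out.append(c);
    }
    return out.toString();
  }

  public static void main(String[] args) throws IOException
  {
    // Same setup as the test case above, but the text is escaped
    // before it reaches the ISO-8859-1 encoder.
    try (PrintWriter writer =
           new PrintWriter(args[0], StandardCharsets.ISO_8859_1.name()))
    {
      writer.print(escapeNonAscii("\u2019"));
    }
  }
}
```

      Run with the same filename argument as the test case, this should leave the six ASCII characters of the escape in the file instead of a single ? character.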

              People

              Assignee:
              Naoto Sato (naoto)
              Reporter:
              Sean Coffey (coffeys)
              Votes:
              0
              Watchers:
              3