JDK-8224758

Default Charset code is misleading and contradictory to actual output


      A DESCRIPTION OF THE PROBLEM :
      Charset.defaultCharset() is written so that, if it cannot find file.encoding among the VM properties, it initialises defaultCharset to UTF-8. However, that else branch is effectively dead code when the VM is considered as a whole: if file.encoding is not passed to the VM, the VM infers the value from LC_ALL, LANG and LC_CTYPE, and even if none of these are set, file.encoding is initialised to US_ASCII. The two mechanisms therefore contradict each other: the code in Charset.defaultCharset() signals that the default encoding should be UTF-8, while the initialisation of file.encoding makes it US_ASCII.
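
      To make the lookup-then-fallback shape described above concrete, here is a minimal sketch using only the public java.nio.charset API. It is not the JDK source: the class DefaultCharsetSketch and the method resolve are hypothetical names introduced for illustration, and the real method reads file.encoding through internal helpers.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetSketch {

    // Hypothetical helper (not JDK code): use the named charset if it can be
    // looked up, otherwise fall back to UTF-8, as the report describes.
    public static Charset resolve(String name) {
        if (name != null) {
            try {
                if (Charset.isSupported(name)) {
                    return Charset.forName(name);
                }
            } catch (IllegalArgumentException e) {
                // Illegal charset name: fall through to the UTF-8 fallback.
            }
        }
        return StandardCharsets.UTF_8;
    }

    public static void main(String[] args) {
        // What the lookup yields for the property as the VM initialised it.
        System.out.println(resolve(System.getProperty("file.encoding")));
        // The fallback branch is reached only when the lookup fails.
        System.out.println(resolve(null)); // prints UTF-8
    }
}
```

      The point of the report is that the fallback branch is reachable in this sketch (by passing null) but, per the description above, not in a running VM, because file.encoding has already been given a value by the time the method is called.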

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Unset the environment variables LC_ALL, LANG and LC_CTYPE in your shell.
      2. Write a Java program that invokes Charset.defaultCharset() and prints the result.
      3. Run the program without specifying the file.encoding system property.
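
      The steps above could be exercised with a small program along these lines. The class name PrintDefaultCharset and the describe helper are assumptions made for this sketch; the original report does not include its test code.

```java
import java.nio.charset.Charset;

public class PrintDefaultCharset {

    // Reports both the raw system property and the charset derived from it,
    // so the two values can be compared side by side.
    public static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

      One way to run it with the locale variables removed, assuming a Unix-like shell where env supports the -u option, is: env -u LC_ALL -u LANG -u LC_CTYPE java PrintDefaultCharset. The output depends on the platform and JDK version, so none is shown here.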

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The result should be UTF-8, or the code in Charset.defaultCharset() should be changed to fall back to US_ASCII as well, to make the two consistent.
      ACTUAL -
      The result is US_ASCII.

      CUSTOMER SUBMITTED WORKAROUND :
      The workaround is to pass -Dfile.encoding=UTF-8 to the VM so that the default charset matches the fallback coded in Charset.defaultCharset().

            naoto Naoto Sato
            webbuggrp Webbug Group