  1. JDK
  2. JDK-4803820

jre default charset under linux is incompatible with certain locales/languages


    • Type: Bug
    • Resolution: Not an Issue
    • Priority: P3
    • None
    • 1.4.1
    • core-libs

      Name: nt126004 Date: 01/15/2003


      FULL PRODUCT VERSION :
      java version "1.4.1_01"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
      Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)


      FULL OPERATING SYSTEM VERSION :
      debian-linux
      Kernel: 2.4.19
      libc6: 2.3.1-5

      EXTRA RELEVANT SYSTEM CONFIGURATION :
      For the Hungarian keyboard I use:
      hunglish 1.12-1
      xkbsel 0.13-11

      Locales are (xkbsel, hunglish doesn't change them):
      LANG=C
      LC_CTYPE="C"
      LC_NUMERIC="C"
      LC_TIME="C"
      LC_COLLATE="C"
      LC_MONETARY="C"
      LC_MESSAGES="C"
      LC_PAPER="C"
      LC_NAME="C"
      LC_ADDRESS="C"
      LC_TELEPHONE="C"
      LC_MEASUREMENT="C"
      LC_IDENTIFICATION="C"
      LC_ALL=


      A DESCRIPTION OF THE PROBLEM :
      With JRE 1.4.1, when reading characters from standard input
      without an explicit charset, special characters are decoded incorrectly.
      For characters like á (aacute), é (eacute), etc. I just get
      question marks, or little boxes if they appear in Swing forms.
      If I specify that the input encoding is ISO-8859-2,
      everything works fine.
      With JRE 1.4.0 the JVM's default charset guess was
      correct, and I didn't have to specify the input charset.
      Because of this, I can't use local characters in Swing forms
      at all.

      As shown in my example, the InputStreamReader works fine if I specify which
      charset to use.

      The problem occurs when I _don't_ specify it. In versions prior to 1.4.1 the
      Linux JRE used ASCII (or something close to it, like ISO-8859-1) as the
      default charset. Since 1.4.1 it seems to be using Unicode.
      This is a big difference, since simple Linux distributions use an ASCII
      charset, not Unicode.
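
To check which default charset a given JRE actually selects under a given locale, a small probe along the following lines may help. It is only a sketch: the class name DefaultCharsetProbe is made up for illustration, and Charset.defaultCharset() exists only from Java 5 onward, so on a 1.4.x JRE just the file.encoding property and InputStreamReader.getEncoding() parts apply.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;

// Hypothetical helper (not part of the original report): reports the
// default charset as seen through three different standard APIs.
class DefaultCharsetProbe {

    static String describe() {
        // System property the JVM derives from the locale at startup
        String property = System.getProperty("file.encoding");
        // Default charset object (Java 5 and later)
        String viaCharset = Charset.defaultCharset().name();
        // Encoding a reader falls back to when none is given explicitly;
        // this may be reported under a historical name such as "UTF8"
        String viaReader = new InputStreamReader(
                new ByteArrayInputStream(new byte[0])).getEncoding();
        return "file.encoding=" + property
                + ", Charset.defaultCharset()=" + viaCharset
                + ", InputStreamReader=" + viaReader;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running the probe once with LANG=C and once with a Hungarian locale should show whether the JRE takes its default from the locale settings listed above.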

      An example: since I upgraded my JRE to 1.4.1, I can't use any
      characters outside the ASCII range (codes 128 and above) in SunONE Studio 4
      either. This prevents me from using language-specific text in SunONE Studio,
      and so from writing programs that speak my own language.

      I could rewrite my own programs as you advised, but I can't recompile
      SunONE Studio!

      REGRESSION. Last worked in version 1.4

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Compile the source code.
      2. Run it with JRE 1.4.1_01 and enter e.g. "árvíztűrő tükörfúrógép".
      3. /tmp/testJava.txt contains: ?rv?zt?r? t?k?rf?r?g?p
      4. Run it with JRE 1.4.0_3; /tmp/testJava.txt is correct.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      Actual: /tmp/testJava.txt contains: ?rv?zt?r? t?k?rf?r?g?p
      Expected:
      árvíztűrő tükörfúrógép
      With JRE 1.4.0_3 the output is correct.
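
The question-mark output can be reproduced without touching the JVM default, by decoding ISO-8859-2 bytes with a charset that cannot represent them. The sketch below assumes, purely for illustration, that the mismatched charset is US-ASCII; the report itself does not establish which charset 1.4.1 selects. The class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;

// Hypothetical demo (not from the original report): simulates typing text
// in one charset, decoding it with another, and writing it back out.
class CharsetMismatchDemo {

    static String roundTrip(String text, String inputCharset, String assumedCharset)
            throws UnsupportedEncodingException {
        // Bytes as the terminal would deliver them
        byte[] typed = text.getBytes(inputCharset);
        // Decoding with the assumed charset; undecodable bytes become U+FFFD
        String decoded = new String(typed, assumedCharset);
        // Writing the file explicitly in the input charset, as the test
        // program does; U+FFFD is unmappable and is replaced by '?'
        byte[] written = decoded.getBytes(inputCharset);
        return new String(written, inputCharset);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // The sample phrase from the report, written with Unicode escapes so
        // the source file itself stays pure ASCII
        String text = "\u00e1rv\u00edzt\u0171r\u0151 t\u00fck\u00f6rf\u00far\u00f3g\u00e9p";
        // prints: ?rv?zt?r? t?k?rf?r?g?p
        System.out.println(roundTrip(text, "ISO-8859-2", "US-ASCII"));
    }
}
```

When both charset arguments are ISO-8859-2 the text survives the round trip unchanged, which matches the behaviour reported when the encoding is specified explicitly.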

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      // here is a sample program which was used to generate the attached screenshot
      /*
       * textFrame.java
       *
       * Created on January 7, 2003, 8:38 PM
       */

      /**
       *
       * @author mrbig
       */
      public class textFrame extends javax.swing.JFrame {
          
          /** Creates new form textFrame */
          public textFrame() {
              initComponents();
          }
          
          /** This method is called from within the constructor to
           * initialize the form.
           * WARNING: Do NOT modify this code. The content of this method is
           * always regenerated by the Form Editor.
           */
          private void initComponents() {//GEN-BEGIN:initComponents
              jTextArea1 = new javax.swing.JTextArea();

              addWindowListener(new java.awt.event.WindowAdapter() {
                  public void windowClosing(java.awt.event.WindowEvent evt) {
                      exitForm(evt);
                  }
              });

              getContentPane().add(jTextArea1, java.awt.BorderLayout.CENTER);

              pack();
              java.awt.Dimension screenSize = java.awt.Toolkit.getDefaultToolkit().getScreenSize();
              setSize(new java.awt.Dimension(200, 150));
              setLocation((screenSize.width-200)/2,(screenSize.height-150)/2);
          }//GEN-END:initComponents
          
          /** Exit the Application */
          private void exitForm(java.awt.event.WindowEvent evt) {//GEN-FIRST:event_exitForm
              System.exit(0);
          }//GEN-LAST:event_exitForm
          
          /**
           * @param args the command line arguments
           */
          public static void main(String args[]) {
              new textFrame().show();
          }
          
          
          // Variables declaration - do not modify//GEN-BEGIN:variables
          private javax.swing.JTextArea jTextArea1;
          // End of variables declaration//GEN-END:variables
          
      }

      // here is another program which demonstrates the problem
      import java.io.*;

      public class test {
         public static void main(String[] args) {
            try {
               // Using the system default charset
               // Works on 1.4.0_03, wrong on 1.4.1_01
               LineNumberReader reader = new LineNumberReader(
                     new InputStreamReader(System.in));
               
               // Alternatively this works on 1.4.1_01
               // LineNumberReader reader = new LineNumberReader(
               //       new InputStreamReader(System.in, "ISO-8859-2"));
               String txt = reader.readLine();
               
               // Output to a file, explicitly in ISO-8859-2
               PrintWriter writer = new PrintWriter(new OutputStreamWriter(
                     new FileOutputStream("/tmp/testJava.txt"), "ISO-8859-2"));
               writer.print(txt);
               writer.close();
            }
            catch (Exception e) {
               System.out.println(e);
            }
         }
      }

      ---------- END SOURCE ----------

      CUSTOMER WORKAROUND :
      Explicitly setting jre default encoding to ISO-8859-1? Is
      this possible?
      (Review ID: 179080)
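
For code that can be modified, one way to stop depending on the JVM default is to make the input charset a parameter that falls back to the default only when nothing is supplied. The following is a minimal sketch under that assumption; the class InputReaderFactory and the idea of passing the charset name in from configuration are illustrative and are not part of the original report. It does not help with third-party applications such as SunONE Studio, which cannot be recompiled.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

// Hypothetical helper (not from the original report): opens a reader with an
// explicit charset when one is configured, else with the JVM default.
class InputReaderFactory {

    static BufferedReader open(InputStream in, String charsetName) throws IOException {
        if (charsetName == null || charsetName.length() == 0) {
            // No charset configured: use whatever default the JVM chose
            return new BufferedReader(new InputStreamReader(in));
        }
        // Explicit charset: independent of locale and JRE version
        return new BufferedReader(new InputStreamReader(in, charsetName));
    }

    public static void main(String[] args) throws IOException {
        // Simulated terminal input: ISO-8859-2 bytes for the start of the
        // sample phrase, followed by a newline
        String sample = "\u00e1rv\u00edzt\u0171r\u0151";
        byte[] typed = (sample + "\n").getBytes("ISO-8859-2");
        BufferedReader reader = open(new ByteArrayInputStream(typed), "ISO-8859-2");
        // Reports whether the decoded line matches the original text
        System.out.println(sample.equals(reader.readLine()));
    }
}
```

In a real program the stream would be System.in and the charset name would come from a command-line argument or a configuration file.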
      ======================================================================

            ilittlesunw Ian Little (Inactive)
            nthompsosunw Nathanael Thompson (Inactive)