JDK-4867251

OutputStreamWriter/InputStreamReader convert NEL to linefeed with Cp037 encoding

Details

    • Bug
    • Resolution: Not an Issue
    • P4
    • None
    • 1.4.2
    • core-libs

    Description

      Name: rmT116609 Date: 05/20/2003


      FULL PRODUCT VERSION :
      java version "1.4.2-beta"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b19)
      Java HotSpot(TM) Client VM (build 1.4.2-beta-b19, mixed mode)

      FULL OS VERSION :
      Linux stallion.elharo.com 2.4.18-6mdk #1 Fri Mar 15 02:59:08 CET 2002 i686 unknown

      A DESCRIPTION OF THE PROBLEM :
      When InputStreamReader uses the Cp037 (EBCDIC US) encoding and reads a NEL (Unicode 0x85, EBCDIC 0x15), it converts it into a linefeed (\n). When OutputStreamWriter writes a linefeed in Cp037, it writes a NEL (EBCDIC 0x15) instead.

      NEL and linefeed are *not* the same character. Cp037 has separate, distinct code points for linefeed and NEL. It is important for XML parsing, among other uses, that they not be confused. The linefeed character qualifies as white space in XML. NEL does not. Several XML parsers have serious errors as a result of depending on Java to convert EBCDIC to Unicode.
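
      The white space distinction can be sketched as follows. This is a minimal illustration, assuming the XML 1.0 white space production S, which admits only #x20, #x9, #xD, and #xA. The class and method names are hypothetical and are not taken from any parser.

```java
// Sketch of the XML 1.0 white space production: S ::= (#x20 | #x9 | #xD | #xA)+
// XmlWhitespace and isXmlWhitespace are hypothetical names used for illustration.
final class XmlWhitespace {

    /** Returns true if c is one of the four XML 1.0 white space characters. */
    static boolean isXmlWhitespace(char c) {
        return c == 0x20 || c == 0x09 || c == 0x0D || c == 0x0A;
    }

    public static void main(String[] args) {
        System.out.println(isXmlWhitespace((char) 0x0A)); // linefeed: prints true
        System.out.println(isXmlWhitespace((char) 0x85)); // NEL: prints false
    }
}
```

      A parser that receives a linefeed where the document contained a NEL will therefore accept the character as white space in places where the NEL would not be.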

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Run attached program

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      This code should read and write all three common line-end characters as distinct values: NEL, linefeed, and carriage return. Instead, only two are seen in each direction. On output all linefeeds are changed to NELs. On input all NELs are changed to linefeeds.
      ACTUAL -
      Testing input stream
      10
      10
      10
      10
      10
      10
      10
      10
      13
      13
      13
      13
      Testing output stream
      0x15
      0x15
      0x15
      0x15
      0xD
      0xD
      0xD
      0xD
      0x15
      0x15
      0x15
      0x15
      0xD
      0x15
      0xD
      0x15
      0xD
      0x15
      0xD
      0x15

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      import java.io.*;

      public class NELTest {
       
        public static void main(String[] args) throws Exception {
        
          System.out.println("Testing input stream");
          // EBCDIC (Cp037) input bytes: 0x15 = NEL, 0x25 = linefeed, 13 = carriage return
          byte[] data = {(byte) 0x15, (byte) 0x15, (byte) 0x15, (byte) 0x15, (byte) 0x25, (byte) 0x25, (byte) 0x25, (byte) 0x25, (byte) 13, (byte) 13, (byte) 13, (byte) 13};
          ByteArrayInputStream in = new ByteArrayInputStream(data);
          InputStreamReader reader = new InputStreamReader(in, "Cp037");
          int c;
          while ((c = reader.read()) != -1) {
              System.out.println(c);
          }
           
          System.out.println("Testing output stream");
          ByteArrayOutputStream out = new ByteArrayOutputStream();
          OutputStreamWriter writer = new OutputStreamWriter(out, "Cp037");
          writer.write((char) 0x85);
          writer.write((char) 0x85);
          writer.write((char) 0x85);
          writer.write((char) 0x85);
          writer.write((char) 13);
          writer.write((char) 13);
          writer.write((char) 13);
          writer.write((char) 13);
          writer.write((char) 10);
          writer.write((char) 10);
          writer.write((char) 10);
          writer.write((char) 10);
          writer.write((char) 13);
          writer.write((char) 10);
          writer.write((char) 13);
          writer.write((char) 10);
          writer.write((char) 13);
          writer.write((char) 10);
          writer.write((char) 13);
          writer.write((char) 10);
          writer.flush();
          writer.close();
          
          byte[] result = out.toByteArray();
          for (int i = 0; i < result.length; i++) {
              // Mask to 0..255 so that bytes >= 0x80 are not printed sign-extended
              System.out.println("0x" + Integer.toHexString(result[i] & 0xFF).toUpperCase());
          }
           
        }
          
          
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      I've written my own special-purpose EBCDIC writer that correctly converts NELs to linefeeds. For input I don't yet have a workaround, since the bug tends to manifest itself fairly deeply inside XML parsers.
      (Review ID: 185599)
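
      One possible shape for such a workaround is sketched below. It is not the submitter's writer. It assumes the mapping stated in this report (EBCDIC 0x15 is NEL, 0x25 is linefeed), handles those two characters explicitly, and delegates everything else to the JDK's Cp037 charset one character at a time. The class name Cp037LineEnds and its methods are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;

// Hypothetical helper: keeps NEL and linefeed distinct when converting
// between Unicode and Cp037, instead of relying on the JDK's mapping.
final class Cp037LineEnds {
    private static final Charset CP037 = Charset.forName("Cp037");
    static final int EBCDIC_NEL = 0x15; // per the report: NEL in Cp037
    static final int EBCDIC_LF = 0x25;  // per the report: linefeed in Cp037

    /** Encodes text to Cp037, writing linefeed as 0x25 and NEL as 0x15. */
    static byte[] encode(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == 0x0A) {
                out.write(EBCDIC_LF);
            } else if (c == 0x85) {
                out.write(EBCDIC_NEL);
            } else {
                // Cp037 is a single-byte charset; unmappable chars become its substitute byte.
                byte[] b = String.valueOf(c).getBytes(CP037);
                out.write(b, 0, b.length);
            }
        }
        return out.toByteArray();
    }

    /** Decodes Cp037 bytes, reading 0x25 as linefeed and 0x15 as NEL. */
    static String decode(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length);
        for (byte b : data) {
            int v = b & 0xFF;
            if (v == EBCDIC_LF) {
                sb.append((char) 0x0A);
            } else if (v == EBCDIC_NEL) {
                sb.append((char) 0x85);
            } else {
                sb.append(new String(new byte[] {b}, CP037));
            }
        }
        return sb.toString();
    }
}
```

      Converting one character at a time is slow but keeps the sketch short. It only helps code that controls the conversion itself, so it does not address the input case described above, where the decoding happens inside an XML parser.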
      ======================================================================

            People

              Assignee: Unassigned
              Reporter: rmandalasunw Ranjith Mandala (Inactive)