
Type: Bug
Resolution: Fixed
Priority: P4
Fix Version/s: 1.3.1_12
Affects Version/s: 1.3.1_09
Component/s: core-libs
Labels:

Subcomponent: java.nio.charsets
Resolved In Build: 12
CPU: sparc
OS: solaris_8
Verification: Verified

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-2076635	5.0	Ian Little	P4	Resolved	Fixed	b38
JDK-2076634	1.4.2_05	Ian Little	P4	Closed	Fixed	05

Name: dk106046 Date: 10/31/2003

Operating System(s) :
Sun Solaris 2.8

Full JDK version(s) (from java -version) :
java version "1.3.1_09"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_09-b03)
Java HotSpot(TM) Client VM (build 1.3.1_09-b03, mixed mode)

Detailed description of the problem:
Lines of EBCDIC text (Cp930) are converted to Kanji (Cp943), but some characters, for example the hyphen, do not convert correctly. The problem does not occur with Java 1.2.2_17.

- Exact steps to reproduce:

1 Detach the Java files and FTP them to Solaris in binary mode
2 Compile with the appropriate JDK
3 Run as java CallConverter > jdk131-09.html
4 FTP jdk131-09.html back to Windows in binary mode
5 Open jdk131-09.html in IE 5.50 or above
6 Go to View->Encoding and select Japanese (Shift-JIS) to view the
correct character set.
The output contains an unwanted character that appears as a circle. It is circled in red in the Word document (picjdk131-04.doc, available on request). The expected output is shown in the HTML document (outputFromJDK1.1.8.html, available on request).

- Source code that demonstrates the problem:

=============== CallConverter.java ==========================================

public class CallConverter {

    public static void main(String args []){
        //This is what came back from the mainframe
        String input = "0E43CE438A43A8404044C445BC45B6459A45864040426045804567455240404586458545530F";
        //Hexify the input
        String line = CharacterConverter.getInstance().hexifyString(input);
        //Convert from CodePage930 to CodePage943
        if (line != null && line.length() != 0) {
            System.out.println(CharacterConverter.getInstance().charCodeConvert(line,"Cp930","Cp943"));
        }

    }
}

=============== CharacterConverter.java =======================================

import java.io.UnsupportedEncodingException;
public class CharacterConverter {
        private static String defaultCode = "ISO8859-1";
        private static CharacterConverter instance = new CharacterConverter();

private CharacterConverter()
{
        super();
}
public static CharacterConverter getInstance()
{
        return instance;
}
public String hexifyString(String stringToHexify)
{
        String errMsg = null;
        String tempHex = "";

        // Parse input string to strip out unnecessary 00's and FF's
        boolean shiftout = true;
        int hexIdx = 0;
        int len = stringToHexify.length();

        if ((len % 2) != 0)
        {
                System.out.println("len%2 s");
                return null;
        }

        while (hexIdx < len)
        {
                // Delete 00's and FF's
                if ((stringToHexify.charAt(hexIdx) == '0' && stringToHexify.charAt(hexIdx + 1) == '0') || (stringToHexify.charAt(hexIdx) == 'F' && stringToHexify.charAt(hexIdx + 1) == 'F'))
                {
                        hexIdx += 2;
                }
                else if (!(stringToHexify.charAt(hexIdx) == '0' && stringToHexify.charAt(hexIdx + 1) == 'E'))
                {
                        // We have a valid single-byte pair of characters
                        tempHex += stringToHexify.substring(hexIdx, hexIdx + 2);
                        hexIdx += 2;
                }
                else
                {
                        // we've found a shift-in
                        // copy the "0E"
                        tempHex += stringToHexify.substring(hexIdx, hexIdx + 2);
                        hexIdx += 2;

                        // look for 00 and FF every fourth position until we find shift-out
                        shiftout = false;
                        while (!shiftout && hexIdx < len)
                        {
                                if (stringToHexify.charAt(hexIdx) == '0' && stringToHexify.charAt(hexIdx + 1) == 'F')
                                {
                                        shiftout = true;
                                        tempHex += stringToHexify.substring(hexIdx, hexIdx + 2);
                                        hexIdx += 2;
                                }
                                else if ((stringToHexify.charAt(hexIdx) == '0' && stringToHexify.charAt(hexIdx + 1) == '0') || (stringToHexify.charAt(hexIdx) == 'F' && stringToHexify.charAt(hexIdx + 1) == 'F'))
                                {
                                        // don't copy any four byte sequence beginning with 00's or FF's
                                        hexIdx += 4;
                                }
                                else
                                {
                                        tempHex += stringToHexify.substring(hexIdx, hexIdx + 4);
                                        hexIdx += 4;
                                }
                        }
                }
        }
        String hexedString = tempHex;
        if (hexedString != null && !hexedString.equals(""))
        {
                // hexify the string.

                len = hexedString.length();

                if ((len % 2) != 0)
                {
                        System.out.println("len%2");
                        return null;
                }

                char hexStr[] = new char[len / 2];

                for (int i = 0; i < len; i += 2)
                {
                        hexStr[i / 2] = (char) Integer.parseInt(hexedString.substring(i, i + 2), 16);
                }

                hexedString = new String(hexStr);
        }
        return hexedString;
}
public String charCodeConvert(String hexedString, String defaultEnCode, String fromCode, String toCode)
{
        String convertedString = null;
        try
        {
                String conString = new String(hexedString.getBytes(defaultEnCode), fromCode);
                convertedString = new String(conString.getBytes(toCode));
        }
        catch (UnsupportedEncodingException ue)
        {
                // throw new JavaException("CharacterConverter", ExceptionTypesEnum.ERROR, ue.toString());
                //ue.printStackTrace();
                System.out.println("UnsupportedEncodingException");
        }
        return convertedString;
}
public String charCodeConvert(String hexedString, String fromCode, String toCode)
{
        return charCodeConvert(hexedString, defaultCode, fromCode, toCode );
}

}
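
To isolate the decoding half of the conversion, a smaller probe can decode a single double-byte pair and print the resulting Unicode code point. The following is only a sketch, not part of the submitted test case: the class name Cp930Probe and its helpers are hypothetical, the trimmed input "0E42600F" assumes that the pair 42 60 from the sample input is the affected hyphen, and it uses the String(byte[], Charset) constructor, which needs a much newer JDK than 1.3.1. It builds the byte array directly instead of round-tripping through an ISO8859-1 String, and it prints a notice instead of failing when the runtime does not ship Cp930.

```java
import java.nio.charset.Charset;

public class Cp930Probe {

    // Turn a string of hex digit pairs into the raw bytes they denote.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Format every code point of s as U+XXXX, separated by spaces.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumption: 0E and 0F bracket the double-byte run, as in the sample
        // input, and 42 60 is the pair that decodes to the hyphen.
        byte[] raw = hexToBytes("0E42600F");
        if (!Charset.isSupported("Cp930")) {
            System.out.println("Cp930 is not available in this runtime");
            return;
        }
        String decoded = new String(raw, Charset.forName("Cp930"));
        System.out.println(codePoints(decoded));
    }
}
```

Running the probe on an affected and an unaffected JDK and comparing the printed code points should show whether the decoder, rather than the Cp943 encoder, produces the difference.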

We suggested the following fix:

In the file ext\i18n\src\share\sun\io\ByteToCharCp930.java,

change line 231 from:
   "\uFF01\uFFE5\uFF0A\uFF09\uFF1B\uFFE2\uFF0D\uFF0F\uFFFD\uFFFD" + // 400 - 409
to:
  "\uFF01\uFFE5\uFF0A\uFF09\uFF1B\uFFE2\u2212\uFF0F\uFFFD\uFFFD" + // 400 - 409

The FF0D character was identified as a problem on the following website: http://oss.software.ibm.com/pipermail/icu4c-support/2002-October/000757.html
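
The suggested change replaces one code point in the decoder table with another that looks almost identical when rendered. As an illustration only, the sketch below prints the Unicode names of the two code points involved. The class name HyphenCodePoints is hypothetical, and Character.getName requires Java 7 or later, so this cannot run on the JDK versions named in this report.

```java
public class HyphenCodePoints {

    // Return the code point in U+XXXX form followed by its Unicode name.
    static String describe(int codePoint) {
        return String.format("U+%04X %s", codePoint, Character.getName(codePoint));
    }

    public static void main(String[] args) {
        // The code point in the table entry before the suggested fix
        System.out.println(describe(0xFF0D)); // prints "U+FF0D FULLWIDTH HYPHEN-MINUS"
        // The code point in the table entry after the suggested fix
        System.out.println(describe(0x2212)); // prints "U+2212 MINUS SIGN"
    }
}
```

Because the two are distinct code points, an encoder for the target code page may map one of them and not the other, which would be consistent with the unwanted character described above.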

======================================================================

backported by

JDK-2076635 Japanese characters not converting correctly from Codepage 930 to Codepage 943 (Resolved)
JDK-2076634 Japanese characters not converting correctly from Codepage 930 to Codepage 943 (Closed)
