- Type: Bug
- Resolution: Not an Issue
- Priority: P4
- None
- Affects Version/s: 6
- Component/s: core-libs

- x86
- windows_xp

FULL PRODUCT VERSION :
1.6.0_04
ADDITIONAL OS VERSION INFORMATION :
Windows XP SP2
A DESCRIPTION OF THE PROBLEM :
In the third panel of the intl.cpl control panel on Windows XP, you can set the default code page to use for "non-Unicode" programs.
For example, this controls (for all programs, including those that are fully Unicode internally) how a text file with no encoding information is to be interpreted.
Starting with Java 5, Charset.defaultCharset() relies on the system property "file.encoding" to return such a value.
Unfortunately, it does not return the correct value, which is the code page that the Win32 API function GetACP() returns.
Instead the implementation (GetJavaProperties) in java_props_md.c confuses locale with "default encoding" and tries to return a code page matching the default locale.
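To see what the JVM actually reports on a given machine, a small probe along the following lines may help. This is only a sketch: the class name DefaultCharsetProbe and the method describe() are made up for illustration and are not part of the report. It prints the "file.encoding" property next to Charset.defaultCharset() so the pair can be compared by hand with the code page selected in intl.cpl.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original report.
class DefaultCharsetProbe {
    /** Returns a one-line summary of what the JVM considers the default encoding. */
    static String describe() {
        String prop = System.getProperty("file.encoding"); // may differ per platform and JDK version
        Charset cs = Charset.defaultCharset();
        return "file.encoding=" + prop + ", defaultCharset=" + cs.name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On the setup described in this report, the expectation would be a value corresponding to code page 1251; the report states that a 1252-based value is seen instead.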
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Set the "Language for non-Unicode programs" on a US English Windows installation to Russian and reboot.
Open Notepad, paste some Russian text into it, and save it as a (non-Unicode) .txt file. Reopen it in Notepad and confirm that it looks OK.
Now write a small Java app that reads the file using Charset.defaultCharset() and inspect the contents in Unicode.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Expected Java code to behave as native Windows applications do.
ACTUAL -
The Russian characters, stored in code page 1251, are interpreted as being in the Windows-1252 character set.
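The effect can be imitated without changing any Windows settings. The sketch below assumes the JRE provides the "windows-1251" and "windows-1252" charsets; the class MojibakeDemo and the method misdecode() are hypothetical names used only for illustration. It encodes a Russian word as code page 1251 bytes and then decodes those bytes as Windows-1252, which is the mix-up described above.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the original report.
class MojibakeDemo {
    /** Encodes text with one charset, then decodes the same bytes with another. */
    static String misdecode(String text, String writtenAs, String readAs) {
        byte[] bytes = text.getBytes(Charset.forName(writtenAs));
        return new String(bytes, Charset.forName(readAs));
    }

    public static void main(String[] args) {
        // "Privet" in Cyrillic, written with Unicode escapes so the source file encoding does not matter.
        String original = "\u041f\u0440\u0438\u0432\u0435\u0442";
        String garbled = misdecode(original, "windows-1251", "windows-1252");
        // Print code points instead of glyphs to avoid depending on the console encoding.
        for (char c : garbled.toCharArray()) {
            System.out.printf("U+%04X ", (int) c);
        }
        System.out.println();
    }
}
```

The six Cyrillic letters come back as accented Latin letters (U+00CF U+00F0 U+00E8 U+00E2 U+00E5 U+00F2), which is the kind of garbage the reproduction steps produce.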
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
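import java.io.*;
import java.nio.charset.Charset;

public class ReadWithDefaultCharset {
    public static void main(String[] args) throws IOException {
        Charset cs = Charset.defaultCharset(); // Should match GetACP but doesn't
        BufferedReader br = new BufferedReader( // args[0] is the file saved from Notepad
                new InputStreamReader(new FileInputStream(args[0]), cs));
        try {
            String s;
            while ((s = br.readLine()) != null) {
                System.out.println(s);
            }
        } finally {
            br.close(); // release the file handle even if reading fails
        }
    }
}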
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
The only workaround I have found is to write JNI code that calls GetACP and returns "cp" + the code page retrieved.
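The Java side of that workaround might look roughly like the sketch below. It covers only the mapping from a code page number to a Charset; the native GetACP call is not shown, and the class CodePageCharsets and the method forWindowsCodePage() are hypothetical names. It relies on the JRE registering "cp" + number as a charset alias, which holds for common code pages such as 1251 and 1252 but is an assumption for any other value.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of the original report.
class CodePageCharsets {
    /**
     * Maps a Windows ANSI code page number (for example, the value a native
     * GetACP call would return) to a Java Charset via the "cp<number>" alias.
     * Charset.forName throws UnsupportedCharsetException if no such alias exists.
     */
    static Charset forWindowsCodePage(int codePage) {
        return Charset.forName("cp" + codePage);
    }

    public static void main(String[] args) {
        // 1251 is hard-coded here; in the workaround it would come from JNI.
        Charset cs = forWindowsCodePage(1251);
        System.out.println(cs.name());
    }
}
```

The resulting Charset can then be passed to InputStreamReader in place of Charset.defaultCharset().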