Type: Bug
Resolution: Fixed
Priority: P3
Fix Version/s: 8
Affects Version/s: 7
Component/s: core-libs
Labels:
- regression
- webbug

Subcomponent: java.lang
Introduced In Build: b55
Introduced In Version: 7
Resolved In Build: b115
CPU: generic
OS: generic
Verification: Verified

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-8027426	7u60	Yuka Kamiya	P3	Closed	Fixed	b01

FULL PRODUCT VERSION :
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

ADDITIONAL OS VERSION INFORMATION :
3.8.0-25-generic #37-Ubuntu SMP Thu Jun 6 20:47:07 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

A DESCRIPTION OF THE PROBLEM :
The problem does not occur when the test is run in the Turkish locale.
To reproduce it, set the default locale to English (probably any non-Turkish locale will do).

In the English locale, if a string containing the dotted capital I character (Turkish I, \u0130) is converted to lower case with the toLowerCase method, an extra (and invalid) character is added to the resulting string immediately after the lower-cased Turkish I.

REGRESSION. Last worked in version 6u45

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        String stringWithDottedI = "\u0130";
        Locale.setDefault(new Locale("en", "US"));

        String lowerCasedString = stringWithDottedI.toLowerCase();

        assertEquals(stringWithDottedI.length(), lowerCasedString.length());

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
lowerCasedString.length() == 1
ACTUAL -
lowerCasedString.length() == 2
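
To see which character is appended, the UTF-16 code units of the result can be printed. The following is a minimal sketch, not part of the original report: the class name DottedIInspector and the helper codeUnits are hypothetical, and Locale.US is passed explicitly so that the default locale does not have to be changed. The output depends on the JDK build in use, so no expected output is given here. For reference, the Unicode SpecialCasing data maps U+0130 to U+0069 followed by U+0307 (COMBINING DOT ABOVE) outside the Turkish and Azeri languages.

```java
import java.util.Locale;

// Hypothetical helper class for inspecting the lower-cased string.
final class DottedIInspector {

    // Formats each UTF-16 code unit of s as U+XXXX, separated by single spaces.
    static String codeUnits(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format(Locale.ROOT, "U+%04X", (int) s.charAt(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String dottedI = "\u0130";
        // Explicit non-Turkish locale, equivalent to the en_US default in the report.
        String lowered = dottedI.toLowerCase(Locale.US);
        System.out.println("length     = " + lowered.length());
        System.out.println("code units = " + codeUnits(lowered));
    }
}
```

A length of 2 together with a second code unit of U+0307 would correspond to the ACTUAL behavior reported above.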

REPRODUCIBILITY :
This bug can be reproduced always.

---------- BEGIN SOURCE ----------
package test;

import org.junit.Before;
import org.junit.Test;

import java.util.Locale;

import static org.junit.Assert.assertEquals;

public class StringTest {
    private final String stringWithDottedI = "\u0130";

    @Before
    public void setup() {
    }

    @Test
    public void testWhenLocaleIsTurkish_lowerCasedStringShouldHaveSameLength() {
        Locale.setDefault(new Locale("tr", "TR"));

        String lowerCasedString = stringWithDottedI.toLowerCase();

        assertEquals(stringWithDottedI.length(), lowerCasedString.length());
    }

    @Test
    public void testWhenLocaleIsEnglish_lowerCasedStringShouldHaveSameLength() {
        Locale.setDefault(new Locale("en", "US"));

        String lowerCasedString = stringWithDottedI.toLowerCase();

        assertEquals(stringWithDottedI.length(), lowerCasedString.length());
    }
}

---------- END SOURCE ----------

CUSTOMER SUBMITTED WORKAROUND :
Set the default locale to Turkish (new Locale("tr", "TR")),
or call the toLowerCase overload that accepts a Locale parameter and pass a Turkish locale.
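
The second workaround can be sketched as follows. This is an illustration under stated assumptions rather than a recommended fix: the class name TurkishLowerCase and the method lower are hypothetical, and Locale.forLanguageTag("tr-TR") is used in place of new Locale("tr", "TR") to designate the same locale.

```java
import java.util.Locale;

// Hypothetical wrapper illustrating the locale-explicit workaround.
final class TurkishLowerCase {

    // Turkish (Turkey); the same locale as new Locale("tr", "TR") in the report.
    static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    // Lower-cases s with Turkish casing rules, independent of the default locale.
    static String lower(String s) {
        return s.toLowerCase(TURKISH);
    }

    public static void main(String[] args) {
        String dottedI = "\u0130";
        String lowered = lower(dottedI);
        // Under Turkish rules U+0130 maps to a plain "i", so the length is preserved.
        System.out.println(lowered.length() == dottedI.length());
    }
}
```

Note that Turkish casing rules also change how the ASCII letter I is lower-cased: it becomes the dotless ı (\u0131) rather than i, so this workaround is only appropriate for text that is actually Turkish.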

backported by

JDK-8027426 String.toLowerCase incorrectly increases length, if string contains \u0130 char

Closed

relates to

JDK-8041791 String.toLowerCase regression - violates Unicode standard

Closed

JDK-6404304 RFE: Unicode 5.1 support

Closed

Assignee: Yuka Kamiya (Inactive)
Reporter: Webbug Group
Votes: 0
Watchers: 7
Created: 2013-07-06 05:58
Updated: 2014-04-24 17:28
Resolved: 2013-10-21 14:19
