Differences between JRE and CLDR locale providers cause an application to fail.
The application below behaves differently on JDK 8 and on JDK 11+.
The full report is:
https://mail.openjdk.java.net/pipermail/core-libs-dev/2020-August/068355.html
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class Main {
    public static void main(String[] args) {
        // Uncomment to restore the JDK 8 behavior:
        // System.setProperty("java.locale.providers", "JRE");
        System.out.println(getPriceInCents(Locale.GERMANY, "9,99 €"));
    }

    static int getPriceInCents(Locale locale, String price) {
        try {
            DecimalFormat format = (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
            Number number = format.parse(price);
            return (int) (number.doubleValue() * 100);
        } catch (ParseException e) {
            // Thrown on JDK 9+, where CLDR is the default locale provider
            System.out.println(e);
        }
        return 0;
    }
}
After some digging I think this is caused by the changes done for JDK-8008577 [1].
When I change the java.locale.providers property to "JRE" for example, it works again.
My investigations so far revealed that apparently the CLDR number pattern for the currency slightly differs.
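The difference in the pattern can also be made visible through the public API, without a debugger. The following is only a minimal sketch: the class name SuffixInspector and the helper codePoints are made up for illustration, while toPattern() and getPositiveSuffix() are existing DecimalFormat methods.

```java
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical helper class for illustration, not part of the JDK.
final class SuffixInspector {

    // Renders every code point of s as U+XXXX so that invisible
    // differences (e.g. kinds of spaces) show up in the output.
    static String codePoints(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("U+%04X", cp))
                .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        DecimalFormat format =
                (DecimalFormat) NumberFormat.getCurrencyInstance(Locale.GERMANY);
        // The output depends on the active locale provider.
        System.out.println("pattern: " + format.toPattern());
        System.out.println("suffix:  " + codePoints(format.getPositiveSuffix()));
    }
}
```

Running this once with the default settings and once with -Djava.locale.providers=JRE should show whether the space in front of the currency sign is U+0020 or U+00A0.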
I created breakpoints in sun.util.locale.provider.NumberFormatProviderImpl::getInstance() to display some things:
LocaleProviderAdapter adapter = LocaleProviderAdapter.forType(type);
String[] numberPatterns = adapter.getLocaleResources(override).getNumberPatterns();
DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(override);
int entry = (choice == INTEGERSTYLE) ? NUMBERSTYLE : choice;
DecimalFormat format = new DecimalFormat(numberPatterns[entry], symbols);
// CLDR (type)
// #,##0.00 ¤ (numberPatterns[entry])
// [35,44,35,35,48,46,48,48,-62,-96,-62,-92] (numberPatterns[entry] in bytes)
//
// JRE type
// #,##0.00 ¤;-#,##0.00 ¤ (numberPatterns[entry])
// [35,44,35,35,48,46,48,48,32,-62,-92,59,45,35,44,35,35,48,46,48,48,32,-62,-92] (numberPatterns[entry] in bytes)
The JRE one includes the negative pattern, but the more interesting bit is that apparently the spacing differs here.
For JRE it is a normal space (byte 32, U+0020), but for CLDR it is [-62, -96], which is the UTF-8 encoding of a non-breaking space (NBSP, U+00A0).
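The byte values can be checked with a few lines of Java. This is a small sketch; the class name NbspBytes and the helper utf8Bytes are illustrative only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class for illustration.
final class NbspBytes {

    // Returns the UTF-8 encoding of s as signed byte values.
    static String utf8Bytes(String s) {
        return Arrays.toString(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(utf8Bytes(" "));       // normal space: [32]
        System.out.println(utf8Bytes("\u00A0"));  // no-break space: [-62, -96]
        System.out.println(utf8Bytes("\u00A4"));  // currency sign: [-62, -92]
    }
}
```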
Ultimately this leads to a check failing in DecimalFormat when parsing the string "9,99 €" that obviously includes a normal space.
if (gotPositive) {
    // the regionMatches will return false because nbsp != space
    gotPositive = text.regionMatches(position, positiveSuffix, 0,
                                     positiveSuffix.length());
}
Which itself leads to the following in our case:
// fail if neither or both
if (gotPositive == gotNegative) {
    parsePosition.errorIndex = position;
    // We hit this part here which causes the parsing to fail
    return false;
}
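The suffix comparison can be reproduced in isolation. The sketch below only mirrors the regionMatches call quoted above and is not the actual DecimalFormat code; the class name SuffixMatch and the method suffixMatches are made up for illustration.

```java
// Hypothetical helper class that mimics the suffix check quoted above.
final class SuffixMatch {

    // True if text contains suffix starting exactly at position.
    static boolean suffixMatches(String text, int position, String suffix) {
        return text.regionMatches(position, suffix, 0, suffix.length());
    }

    public static void main(String[] args) {
        String input = "9,99 \u20AC"; // user input with a normal space
        int afterDigits = 4;          // index right after "9,99"

        // Suffix with a normal space (JRE data): matches.
        System.out.println(suffixMatches(input, afterDigits, " \u20AC"));
        // Suffix with a no-break space (CLDR data): does not match.
        System.out.println(suffixMatches(input, afterDigits, "\u00A0\u20AC"));
    }
}
```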
There are workarounds, e.g. setting java.locale.providers as mentioned above, or calling format.setPositiveSuffix(" €") to fix this particular case.
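Another possible workaround, which does not depend on the provider in use, is to normalize the input so that its space characters match the one used in the formatter's own positive suffix. This is only a sketch under assumptions: the class LenientPriceParser and its methods are hypothetical, and it only handles suffix-style currency formats such as the German one. Locales that use a space as grouping separator would need more care.

```java
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper class for illustration, not part of the JDK.
final class LenientPriceParser {

    // Replaces every space character in input with the first space
    // character found in suffix; returns input unchanged if the suffix
    // contains no space character.
    static String alignSpaces(String input, String suffix) {
        int idx = -1;
        for (int i = 0; i < suffix.length(); i++) {
            if (Character.isSpaceChar(suffix.charAt(i))) {
                idx = i;
                break;
            }
        }
        if (idx < 0) {
            return input;
        }
        char target = suffix.charAt(idx);
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            sb.append(Character.isSpaceChar(c) ? target : c);
        }
        return sb.toString();
    }

    // Parses a price such as "9,99 €" and returns it in cents.
    static long priceInCents(Locale locale, String price) {
        DecimalFormat format =
                (DecimalFormat) NumberFormat.getCurrencyInstance(locale);
        String aligned = alignSpaces(price, format.getPositiveSuffix());
        try {
            // Math.round avoids truncating values like 998.999... to 998.
            return Math.round(format.parse(aligned).doubleValue() * 100);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Unparseable price: " + price, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(priceInCents(Locale.GERMANY, "9,99 \u20AC"));
    }
}
```

Unlike setPositiveSuffix(" €"), this leaves the formatter untouched, so formatted output keeps whatever spacing the locale data prescribes.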
Is this a bug or a feature or are we missing something?
In case this is an actual bug, we would appreciate a "reported-by" mention in an eventual fix.
Thanks in advance. I do hope you can follow my thoughts in this email.
[1] https://bugs.openjdk.java.net/browse/JDK-8008577
Relates to: JDK-8008577 Use CLDR Locale Data by Default (Closed)