Summary
Allow loose matching of space separators for both java.time.format and java.text date/time formatters in the lenient parsing mode.
Problem
JDK 20 upgraded CLDR to version 42, which replaced the ASCII space (U+0020) between the time and the am/pm marker with NNBSP (Narrow No-Break Space, U+202F) in English locales. As a result, the localized parsers throw an exception when the input text contains an ASCII space in that position. This change broke some applications, although it is the expected behavior (JDK-8304925). Since NNBSP cannot be distinguished visually from an ASCII space and cannot be input easily, it is not practical for applications to require NNBSP in their input (JDK-8324308). To work around this issue, the JDK's parsers should loosen their matching of space separators.
Solution
The CLDR spec suggests Loose Matching of characters in the Zs category (Character.SPACE_SEPARATOR), so that the differences between ASCII spaces and other space separators, including NNBSP, may be ignored. Since the date/time parsers in both java.time.format and java.text have the concept of lenient parsing, those parsers can treat all space separators as equivalent in their lenient parsing mode.
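The loose-matching rule can be illustrated with a minimal sketch. The class name and the looselyMatches helper are hypothetical, introduced only for illustration; this is not the JDK's internal implementation, only a restatement of the Zs rule described above.

```java
public class SpaceSeparatorSketch {
    // Hypothetical helper: two code points match loosely if they are equal,
    // or if both belong to the Unicode Zs (space separator) category.
    public static boolean looselyMatches(int a, int b) {
        return a == b
            || (Character.getType(a) == Character.SPACE_SEPARATOR
                && Character.getType(b) == Character.SPACE_SEPARATOR);
    }

    public static void main(String[] args) {
        // ASCII space (U+0020) and NNBSP (U+202F) are both in Zs.
        System.out.println(looselyMatches(' ', '\u202F')); // prints "true"
        // TAB (U+0009) is a control character, not a space separator.
        System.out.println(looselyMatches(' ', '\t'));     // prints "false"
    }
}
```

Note that only characters in the Zs category are covered; other whitespace such as TAB or line terminators is not.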
In the java.time.format package, the default parsing mode is strict, so applications need to explicitly enable leniency by calling DateTimeFormatterBuilder.parseLenient(), such as:
var dtf = new DateTimeFormatterBuilder()
    .parseLenient()
    .append(DateTimeFormatter.ofLocalizedTime(FormatStyle.SHORT))
    .toFormatter(Locale.ENGLISH);
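A self-contained sketch of how an application might use such a lenient formatter follows. The class name and the tryParse helper are hypothetical. Whether the ASCII-space input parses depends on the JDK release: it is expected to succeed only on a JDK that includes this change, so no output is claimed for that call.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.format.FormatStyle;
import java.util.Locale;
import java.util.Optional;

public class LenientTimeParseSketch {
    // Same formatter as in the snippet above, built in lenient mode.
    public static final DateTimeFormatter LENIENT = new DateTimeFormatterBuilder()
            .parseLenient()
            .append(DateTimeFormatter.ofLocalizedTime(FormatStyle.SHORT))
            .toFormatter(Locale.ENGLISH);

    // Hypothetical helper: returns the parsed time, or empty if parsing fails.
    public static Optional<LocalTime> tryParse(String text) {
        try {
            return Optional.of(LocalTime.parse(text, LENIENT));
        } catch (DateTimeParseException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Text produced by the formatter itself always round-trips.
        String formatted = LENIENT.format(LocalTime.of(15, 30));
        System.out.println(tryParse(formatted));

        // Input typed with an ASCII space before the am/pm marker.
        // The result depends on whether the running JDK includes this change.
        System.out.println(tryParse("3:30 PM"));
    }
}
```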
In the java.text package, the default parsing mode is lenient, so applications will be able to parse all space separators automatically (that is, the behavior changes by default). If they need to parse the text strictly, they can do:
var df = DateFormat.getTimeInstance(DateFormat.SHORT, Locale.ENGLISH);
df.setLenient(false);
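The difference between the two java.text modes can be explored with a small sketch along these lines. The class name and the timeFormat and parses helpers are hypothetical. How the ASCII-space input is treated in each mode depends on the CLDR data and on whether the running JDK includes this change, so the sketch prints the results rather than asserting them.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.util.Date;
import java.util.Locale;

public class StrictDateFormatSketch {
    // Hypothetical helper: a short English time format with the given leniency.
    public static DateFormat timeFormat(boolean lenient) {
        DateFormat df = DateFormat.getTimeInstance(DateFormat.SHORT, Locale.ENGLISH);
        df.setLenient(lenient);
        return df;
    }

    // Hypothetical helper: true if the text parses without a ParseException.
    public static boolean parses(DateFormat df, String text) {
        try {
            df.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Text produced by the formatter itself parses in both modes.
        String formatted = timeFormat(false).format(new Date(0L));
        System.out.println(parses(timeFormat(true), formatted));
        System.out.println(parses(timeFormat(false), formatted));

        // Input typed with an ASCII space before the am/pm marker.
        // Results depend on the JDK release in use.
        System.out.println("lenient: " + parses(timeFormat(true), "3:30 PM"));
        System.out.println("strict:  " + parses(timeFormat(false), "3:30 PM"));
    }
}
```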
Specification
In java.time.format.DateTimeFormatterBuilder.parseLenient(), add the following:
+ * @implSpec A {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR} in the input
+ * text will match any other {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR}s
+ * in the pattern with the lenient parse style.
In java.time.format.DateTimeFormatterBuilder.parseStrict(), add the following:
+ * @implSpec A {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR} in the input
+ * text will not match any other {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR}s
+ * in the pattern with the strict parse style.
In java.text.DateFormat.setLenient(boolean), add the following:
+ * @implSpec A {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR} in the input
+ * text will match any other {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR}s
+ * in the pattern with lenient parsing; otherwise, it will not match.
CSR of: JDK-8324665 Loose matching of space separators in the lenient date/time parsing mode (Resolved)