Type: CSR
Resolution: Approved
Priority: P4
Fix Version/s: 23
Component/s: core-libs
Labels:
None

Subcomponent:
java.time
Compatibility Kind:

behavioral
Compatibility Risk:
low
Compatibility Risk Description:

Although it would not be common, applications that distinguish ASCII space and other space separators will see behavioral changes in the `lenient` mode. Even in that case, it can be avoided by choosing the `strict` mode, which retains the original behavior.

Interface Kind:

Java API
Scope:
SE

Summary

Allow loose matching of space separators for both java.time.format and java.text date/time formatters in the lenient parsing mode.

Problem

JDK 20 upgraded CLDR to version 42, which replaced the ASCII space (U+0020) between the time and the am/pm marker with NNBSP (Narrow No-Break Space, U+202F) in English locales. As a result, the localized parsers throw an exception when the input text contains an ASCII space in that position. This change broke some applications, although it is the expected behavior (JDK-8304925). Since NNBSP cannot be distinguished visually from an ASCII space and is not easy to type, it is not practical for applications to require NNBSP in input (JDK-8324308). To work around this issue, the JDK's parsers should loosen their matching of space characters.

Solution

The CLDR spec suggests loose matching of characters in the Zs category (Character.SPACE_SEPARATOR), so that differences between ASCII spaces and other space separators, including NNBSP, may be ignored. Since the date/time parsers in both java.time.format and java.text have the concept of lenient parsing, they can treat all space separators as equivalent in their lenient parsing mode.
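
The loose-matching rule itself can be sketched in a few lines. This is only an illustration of the idea, not the JDK's actual implementation: the class `LooseSpaceMatch` and its methods are hypothetical names, and the only JDK APIs relied on are `Character.getType(char)` and `Character.SPACE_SEPARATOR`.

```java
public class LooseSpaceMatch {
    /** Returns true if c is in the Unicode Zs (space separator) category. */
    public static boolean isSpaceSeparator(char c) {
        return Character.getType(c) == Character.SPACE_SEPARATOR;
    }

    /**
     * Hypothetical loose comparison: two characters match if they are
     * identical, or if both are space separators of any kind.
     */
    public static boolean looseEquals(char patternChar, char inputChar) {
        return patternChar == inputChar
            || (isSpaceSeparator(patternChar) && isSpaceSeparator(inputChar));
    }

    public static void main(String[] args) {
        // NNBSP (U+202F) in the pattern vs. ASCII space in the input
        System.out.println(looseEquals('\u202F', ' '));      // true
        // NNBSP vs. NO-BREAK SPACE (U+00A0), also in Zs
        System.out.println(looseEquals('\u202F', '\u00A0')); // true
        // A tab is a control character, not a space separator
        System.out.println(looseEquals('\u202F', '\t'));     // false
    }
}
```

Note that the rule covers only the Zs category, so whitespace such as tabs or line breaks is not matched loosely.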

In the java.time.format package, the default parsing mode is strict, so applications need to opt in to leniency explicitly by calling DateTimeFormatterBuilder.parseLenient(), for example:

    var dtf = new DateTimeFormatterBuilder()
        .parseLenient()
        .append(DateTimeFormatter.ofLocalizedTime(FormatStyle.SHORT))
        .toFormatter(Locale.ENGLISH);

In the java.text package, the default parsing mode is lenient, so applications will parse all space separators automatically (that is, the behavior changes by default). If they need to parse the text strictly, they can do:

    var df = DateFormat.getTimeInstance(DateFormat.SHORT, Locale.ENGLISH);
    df.setLenient(false);
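
To see how a given JDK build behaves, a small probe along the following lines may help. It is a sketch, not part of the change itself: the class `SpaceLeniencyProbe` and its method names are hypothetical, and the results depend on the JDK release and its CLDR data, so no expected output is shown. It tries an ASCII-space input and an NNBSP input against both parsers in lenient and strict mode.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.format.FormatStyle;
import java.util.Locale;

public class SpaceLeniencyProbe {
    /** Builds a short localized time formatter in the requested parse mode. */
    public static DateTimeFormatter timeFormatter(boolean lenient) {
        var builder = new DateTimeFormatterBuilder();
        if (lenient) {
            builder.parseLenient();
        } else {
            builder.parseStrict();
        }
        return builder
            .append(DateTimeFormatter.ofLocalizedTime(FormatStyle.SHORT))
            .toFormatter(Locale.ENGLISH);
    }

    /** Reports whether java.time accepts the text in the given mode. */
    public static boolean parsesWithJavaTime(String text, boolean lenient) {
        try {
            LocalTime.parse(text, timeFormatter(lenient));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    /** Reports whether java.text accepts the text in the given mode. */
    public static boolean parsesWithJavaText(String text, boolean lenient) {
        DateFormat df = DateFormat.getTimeInstance(DateFormat.SHORT, Locale.ENGLISH);
        df.setLenient(lenient);
        try {
            df.parse(text);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] inputs = { "3:30 PM", "3:30\u202FPM" }; // ASCII space, then NNBSP
        for (String text : inputs) {
            for (boolean lenient : new boolean[] { true, false }) {
                System.out.printf("lenient=%b java.time=%b java.text=%b%n",
                    lenient,
                    parsesWithJavaTime(text, lenient),
                    parsesWithJavaText(text, lenient));
            }
        }
    }
}
```

Running the probe on builds before and after this change is one way to check whether an application is affected by the new default in java.text.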

Specification

In java.time.format.DateTimeFormatterBuilder.parseLenient(), add the following:

+     * @implSpec A {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR} in the input
+     * text will match any other {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR}s
+     * in the pattern with the lenient parse style.

In java.time.format.DateTimeFormatterBuilder.parseStrict(), add the following:

+     * @implSpec A {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR} in the input
+     * text will not match any other {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR}s
+     * in the pattern with the strict parse style.

In java.text.DateFormat.setLenient(boolean), add the following:

+     * @implSpec A {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR} in the input
+     * text will match any other {@link Character#SPACE_SEPARATOR SPACE_SEPARATOR}s
+     * in the pattern with lenient parsing; otherwise, it will not match.

csr of

JDK-8324665 Loose matching of space separators in the lenient date/time parsing mode

Resolved

Assignee: Naoto Sato

Reporter: Naoto Sato

Reviewed By: Joe Wang, Roger Riggs

Votes: 0

Watchers: 1

Created: 2024-01-30 09:46

Updated: 2024-02-05 13:23

Resolved: 2024-02-05 13:23
