CSR
Resolution: Approved
P4
None
minimal
This is a doc-only update.
Java API
SE
Summary
Improve the java.util.Locale specification through wording, re-formatting, and upgraded content changes.
Problem
Over time, the java.util.Locale class specification has accumulated many patches. To help users navigate the long class description, we should improve its formatting and commentary. This change does not target a single glaring issue; it makes many smaller changes that together improve the experience for the average user.
Solution
Improve the specification. It is not a goal to drastically change the intent of the specification. The vast majority of changes are wording/grammar improvements/corrections and improved formatting. These cosmetic changes are omitted for brevity.
The notable additions and removals are listed below:
Fix the Javadoc left-hand sidebar so that the appropriate headers are displayed, and not just "Unicode locale/language extension"
Mention where locale-sensitive APIs are mainly found: j.text, j.util (similar to JEP 252)
Briefly introduce the concepts of locale data and locale service providers, as well as warning of varying data from release to release
Better introduce BCP 47 language tags and highlight the differences between java.util.Locale and BCP 47
Rectify incorrect commentary on the variant field under the "Locale Composition" section. Multiple variants are separated by ('_'|'-'), not just ('_').
Better describe the "Obtaining a Locale" section which now provides descriptive examples and warns against the deprecated constructors
Improve the "Use of Locale" section with more meaningful examples and better commentary
Update the "Compatibility" section to emphasize this wording is targeted towards users who want to maintain interoperability with older releases of the reference implementation. (We do this to help newer apps decide whether it is necessary to navigate this section). This section is now tagged with an implNote tag.
To help with trimming down the class description, remove the "Three-letter language/country(region) codes" paragraph from the "Compatibility" section. This wording describes why the specification was relaxed at some point regarding the deprecated constructors. It is very "ad-hoc" and ages poorly. Such behavior is already described in the constructors description and does not require a standalone paragraph.
Remove the incorrect Japanese/Croatia example from the descriptions of the getDisplayLanguage() and getDisplayCountry() methods. There is minimal benefit in describing a corner case with an incorrect hypothetical.
Specification
All changes are captured in the following API diff link: https://cr.openjdk.org/~jlu/8341923_apidiff/java.base/java/util/Locale.html.
As stated in the solution section, cosmetic and other wording changes where the original intent was preserved are omitted for brevity. Below are the notable removals/additions. (Note that the following diff may not match the change-set diff one-to-one; it has been modified to make review easier.)
Update the introductory paragraph to mention Locale Data/Locale Service Providers,
- * A {@code Locale} object represents a specific geographical, political,
- * or cultural region. An operation that requires a {@code Locale} to perform
- * its task is called <em>locale-sensitive</em> and uses the {@code Locale}
- * to tailor information for the user. For example, displaying a number
- * is a locale-sensitive operation— the number should be formatted
- * according to the customs and conventions of the user's native country,
- * region, or culture.
- *
+ * A {@code Locale} represents a specific geographical, political,
+ * or cultural region. An API that requires a {@code Locale} to perform
+ * its task is <em>locale-sensitive</em> and uses the {@code Locale}
+ * to tailor information for the user. These <em>locale-sensitive</em> APIs
+ * are principally in the <i>java.text</i> and <i>java.util</i> packages.
+ * For example, displaying a number is a <em>locale-sensitive</em> operation—
+ * the number should be formatted according to the customs and conventions of the
+ * user's native country, region, or culture.
... Each {@code Locale} is associated with locale data which is provided
+ * by the Java runtime environment or any deployed {@link
+ * java.util.spi.LocaleServiceProvider LocaleServiceProvider} implementations.
+ * The locale data provided by the Java runtime environment may vary by release.
In the "Locale Composition" section, better introduce BCP 47 with the following commentary,
+ * <h2 id="loc_comp">Locale Composition</h2>
+ * <p> A {@code Locale} is composed of the bolded fields described below; note that a
+ * {@code Locale} need not have all such fields. For example, {@link
+ * Locale#ENGLISH Locale.ENGLISH} is only comprised of the <em>language</em> field.
+ * In contrast, a {@code Locale} such as the one returned by {@code
+ * Locale.forLanguageTag("en-Latn-US-POSIX-u-nu-latn")} would be comprised of all
+ * the fields below. This particular {@code Locale} would represent English in
+ * the United States using the Latin script and numerics for use in POSIX
+ * environments.
+ * <p>
+ * {@code Locale} implements IETF BCP 47 and any deviations should be observed
+ * by the comments prefixed by <em>"BCP 47 deviation:"</em>.
+ * <a href="https://tools.ietf.org/html/rfc5646">RFC 5646</a>
+ * combines subtags from various ISO (639, 3166, 15924) standards which are also
+ * included in the composition of {@code Locale}.
+ * Additionally, the full list of valid codes for each field can be found in the
+ * <a href="https://www.iana.org/assignments/language-subtag-registry/language-subtag-registry">
+ * IANA Language Subtag Registry</a> (e.g. search for "Type: region").
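The composition described above can be checked with a short sketch. The class name `LocaleCompositionDemo` and the helper `describe` are hypothetical names introduced only for illustration; the accessor methods are existing `java.util.Locale` methods.

```java
import java.util.Locale;

// Hypothetical demo class, not part of the CSR.
class LocaleCompositionDemo {

    /** Renders the fields of a locale on one line. */
    static String describe(Locale locale) {
        return String.join(", ",
                "language=" + locale.getLanguage(),
                "script=" + locale.getScript(),
                "country=" + locale.getCountry(),
                "variant=" + locale.getVariant(),
                // "nu" is the Unicode extension key for the numbering system
                "nu=" + locale.getUnicodeLocaleType("nu"));
    }

    public static void main(String[] args) {
        // A locale that populates every field
        Locale full = Locale.forLanguageTag("en-Latn-US-POSIX-u-nu-latn");
        System.out.println(describe(full));
        // prints: language=en, script=Latn, country=US, variant=POSIX, nu=latn

        // A locale that only has the language field
        System.out.println(describe(Locale.ENGLISH));
        // prints: language=en, script=, country=, variant=, nu=null
    }
}
```

Absent fields come back as the empty string, except for a missing Unicode extension key, which yields `null`.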
Remedy incorrect variant field commentary under "Locale Composition",
- * <dd>Any arbitrary value used to indicate a variation of a
- * {@code Locale}. Where there are two or more variant values
- * each indicating its own semantics, these values should be ordered
- * by importance, with most important first, separated by
- * underscore('_').
+ * <dd> Any arbitrary value used to indicate a variation of a
+ * {@code Locale}. When multiple variants exist, they should be separated by
+ * {@code ('_'|'-')}. Variants of higher importance should precede the others
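As a rough illustration of the two separators, the sketch below feeds the same pair of variants to `Locale.forLanguageTag` and to `Locale.Builder.setVariant`. The class and helper names are hypothetical, and the normalization to '_' in the stored variant is what the reference implementation is assumed to do here rather than something the text above states.

```java
import java.util.Locale;

// Hypothetical demo class, not part of the CSR.
class VariantSeparatorDemo {

    /** Variant string of a locale parsed from a BCP 47 language tag. */
    static String variantViaTag(String languageTag) {
        return Locale.forLanguageTag(languageTag).getVariant();
    }

    /** Variant string of a locale assembled with Locale.Builder. */
    static String variantViaBuilder(String language, String variant) {
        return new Locale.Builder()
                .setLanguage(language)
                .setVariant(variant)
                .build()
                .getVariant();
    }

    public static void main(String[] args) {
        // Two variants written with '-' in a language tag
        System.out.println(variantViaTag("sl-rozaj-biske"));
        // The builder is given the same variants with either separator
        System.out.println(variantViaBuilder("sl", "rozaj-biske"));
        System.out.println(variantViaBuilder("sl", "rozaj_biske"));
        // On the reference implementation all three lines are expected
        // to print the '_'-separated form.
    }
}
```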
Provide additional commentary and example of a well-formed but not IANA defined subtag
- * <b>Note:</b> Although BCP 47 requires field values to be registered
+ * <b>BCP 47 deviation:</b> Although BCP 47 requires field values to be registered
* in the IANA Language Subtag Registry, the {@code Locale} class
- * does not provide any validation features. The {@code Builder}
+ * does not validate this requirement. For example, the variant code <em>"foobar"</em>
+ * is well-formed since it is composed of 5 to 8 alphanumerics, but is not defined
+ * in the IANA Language Subtag Registry.
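The distinction between well-formed and registered can be sketched as follows. `UnregisteredVariantDemo` and `builderAccepts` are hypothetical names; the example assumes only that `Locale.Builder.setVariant` throws `IllformedLocaleException` for ill-formed input, as its documentation describes.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical demo class, not part of the CSR.
class UnregisteredVariantDemo {

    /** Returns true if Locale.Builder accepts the variant as well-formed. */
    static boolean builderAccepts(String variant) {
        try {
            new Locale.Builder().setLanguage("en").setVariant(variant);
            return true;
        } catch (IllformedLocaleException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "foobar" is not in the IANA registry, but it is well-formed,
        // so it is accepted without complaint.
        System.out.println(Locale.forLanguageTag("en-foobar").getVariant()); // prints "foobar"
        System.out.println(builderAccepts("foobar")); // prints "true"

        // "foo" is too short to be a well-formed variant subtag.
        System.out.println(builderAccepts("foo")); // prints "false"
    }
}
```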
Update the Unicode Locale extension header with a more precise title
- * <h2><a id="def_locale_extension">Unicode locale/language extension</a></h2>
+ * <h3><a id="def_locale_extension">Unicode BCP 47 U Extension</a></h3>
Update the "Obtaining a Locale" section
- * <h3><a id="ObtainingLocale">Obtaining a Locale</a></h3>
- *
- * <p>There are several ways to obtain a {@code Locale}
- * object.
- *
- * <h4>Builder</h4>
- *
- * <p>Using {@link Builder} you can construct a {@code Locale} object
- * that conforms to BCP 47 syntax.
- *
- * <h4>Factory Methods</h4>
- *
- * <p>The method {@link #forLanguageTag} obtains a {@code Locale}
- * object for a well-formed BCP 47 language tag. The method
- * {@link #of(String, String, String)} and its overloads obtain a
- * {@code Locale} object from given {@code language}, {@code country},
- * and/or {@code variant} defined above.
- *
- * <h4>Locale Constants</h4>
- *
- * <p>The {@code Locale} class provides a number of convenient constants
- * that you can use to obtain {@code Locale} objects for commonly used
- * locales. For example, {@code Locale.US} is the {@code Locale} object
- * for the United States.
- *
+ * <h2><a id="ObtainingLocale">Obtaining a Locale</a></h2>
*
+ * <p>There are several ways to obtain a {@code Locale} object.
+ * It is advised against using the deprecated {@code Locale} constructors.
+ *
+ * <dl>
+ * <dt><b>Locale Constants</b></dt>
+ * <dd>A number of convenient constants are provided that return {@code Locale}
+ * objects for commonly used locales. For example, {@link #US Locale.US} is the
+ * {@code Locale} object for the United States.</dd>
+ * <dt><b>Factory Methods</b></dt>
+ * <dd>{@link #of(String, String, String) Locale::of} and its overloads obtain a
+ * {@code Locale} object from the given {@code language}, {@code country},
+ * and/or {@code variant}. {@link #forLanguageTag(String)} obtains a {@code Locale}
+ * object for a well-formed BCP 47 language tag.</dd>
+ * <dt><b>Builder</b></dt>
+ * <dd>{@link Builder} is used to construct a {@code Locale} object that conforms
+ * to BCP 47 syntax. Use a builder to enforce syntactic restrictions on the input.</dd>
+ * </dl>
+ * <p>The following invocations produce Locale objects that are all equivalent:
+ * {@snippet lang=java :
+ * Locale.US;
+ * Locale.of("en", "US");
+ * Locale.forLanguageTag("en-US");
+ * new Locale.Builder().setLanguage("en").setRegion("US").build();
+ * }
+ *
+ * <h2>Usage Examples</h2>
+ *
+ * <p>Once a {@code Locale} is {@linkplain ##ObtainingLocale obtained},
+ * it can be queried for information about itself. For example, use {@link
+ * #getCountry} to get the country (or region) code and {@link #getLanguage} to
+ * get the language. {@link #getDisplayCountry} can be used to get the
+ * name of the country suitable for displaying to the user. Similarly,
+ * use {@link #getDisplayLanguage()} to get the name of
+ * the language suitable for displaying to the user. The {@code getDisplayXXX}
+ * methods are themselves <em>locale-sensitive</em> and have two variants; one with an explicit
+ * locale parameter, and one without. The latter uses the default {@link
+ * Locale.Category#DISPLAY DISPLAY} locale, so the following are equivalent :
+ * {@snippet lang=java :
+ * Locale.getDefault().getDisplayCountry();
+ * Locale.getDefault().getDisplayCountry(Locale.getDefault(Locale.Category.DISPLAY));
+ * }
+ *
+ * <p>The Java Platform provides a number of classes that perform locale-sensitive
+ * operations. For example, the {@code NumberFormat} class formats
+ * numbers, currency, and percentages in a <em>locale-sensitive</em> manner. Classes such
+ * as {@code NumberFormat} have several factory methods for creating a default object
+ * of that type. These methods generally have two variants; one with an explicit
+ * locale parameter, and one without. The latter uses the default {@link
+ * Locale.Category#FORMAT FORMAT} locale, so the following are equivalent :
+ * {@snippet lang=java :
+ * NumberFormat.getCurrencyInstance();
+ * NumberFormat.getCurrencyInstance(Locale.getDefault(Locale.Category.FORMAT));
+ * }
+ *
+ * <p>
+ * The following example demonstrates <em>locale-sensitive</em> currency and
+ * date related operations under different locales :
+ * {@snippet lang = java:
+ * var number = 1000;
+ * NumberFormat.getCurrencyInstance(Locale.US).format(number); // returns "$1,000.00"
+ * NumberFormat.getCurrencyInstance(Locale.JAPAN).format(number); // returns "\u00A51,000"
+ * var date = LocalDate.of(2024, 1, 1);
+ * DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).localizedBy(Locale.US).format(date); // returns "January 1, 2024"
+ * DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).localizedBy(Locale.JAPAN).format(date); // returns "2024\u5e741\u67081\u65e5"
+ * }
+ *
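The "Obtaining a Locale" wording can be exercised with a small sketch, assuming JDK 19 or later for `Locale.of`. The class and method names are hypothetical. It checks that the four ways of obtaining the US English locale compare equal, and shows the builder enforcing syntax that the factory method does not.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical demo class, not part of the CSR.
class ObtainingLocaleDemo {

    /** Checks that the constant, factories, and builder agree. */
    static boolean allEquivalent() {
        Locale constant = Locale.US;
        Locale factory = Locale.of("en", "US");
        Locale fromTag = Locale.forLanguageTag("en-US");
        Locale built = new Locale.Builder().setLanguage("en").setRegion("US").build();
        return constant.equals(factory)
                && factory.equals(fromTag)
                && fromTag.equals(built);
    }

    /** Returns true if Locale.Builder rejects the language as ill-formed. */
    static boolean builderRejects(String language) {
        try {
            new Locale.Builder().setLanguage(language);
            return false;
        } catch (IllformedLocaleException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(allEquivalent()); // prints "true"

        // Locale.of performs no syntactic checks on its arguments...
        Locale unchecked = Locale.of("not-a-language");
        System.out.println(unchecked.getLanguage());

        // ...whereas the builder refuses the same input.
        System.out.println(builderRejects("not-a-language")); // prints "true"
    }
}
```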
Improve the "Usage Examples" section,
+ * <h2>Usage Examples</h2>
+ *
+ * <p>Once a {@code Locale} is {@linkplain ##ObtainingLocale obtained},
+ * it can be queried for information about itself. For example, use {@link
+ * #getCountry} to get the country (or region) code and {@link #getLanguage} to
+ * get the language. {@link #getDisplayCountry} can be used to get the
+ * name of the country suitable for displaying to the user. Similarly,
+ * use {@link #getDisplayLanguage()} to get the name of
+ * the language suitable for displaying to the user. The {@code getDisplayXXX}
+ * methods are themselves <em>locale-sensitive</em> and have two variants; one with an explicit
+ * locale parameter, and one without. The latter uses the default {@link
+ * Locale.Category#DISPLAY DISPLAY} locale :
+ * {@snippet lang=java :
+ * // The following are equivalent
+ * Locale.getDefault().getDisplayCountry();
+ * Locale.getDefault().getDisplayCountry(Locale.getDefault(Locale.Category.DISPLAY));
+ * }
*
+ * <p>The Java Platform provides a number of classes that perform locale-sensitive
+ * operations. For example, the {@code NumberFormat} class formats
+ * numbers, currency, and percentages in a <em>locale-sensitive</em> manner. Classes such
+ * as {@code NumberFormat} have several factory methods for creating a default object
+ * of that type. These methods generally have two variants; one with an explicit
+ * locale parameter, and one without. The latter uses the default {@link
+ * Locale.Category#FORMAT FORMAT} locale :
+ * {@snippet lang=java :
+ * // The following are equivalent
+ * NumberFormat.getCurrencyInstance();
+ * NumberFormat.getCurrencyInstance(Locale.getDefault(Locale.Category.FORMAT));
+ * }
+ *
+ * <p>
+ * The following example demonstrates <em>locale-sensitive</em> operations with different locales :
+ * {@snippet lang = java:
+ * // Localized currency format
+ * var number = 1000;
+ * NumberFormat.getCurrencyInstance(Locale.US).format(number); // returns "$1,000.00"
+ * NumberFormat.getCurrencyInstance(Locale.JAPAN).format(number); // returns "\u00A51,000"
+ * // Localized date format
+ * var date = LocalDate.of(2024, 1, 1);
+ * DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).localizedBy(Locale.US).format(date); // returns "January 1, 2024"
+ * DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG).localizedBy(Locale.JAPAN).format(date); // returns "2024\u5e741\u67081\u65e5"
+ * }
+ *
- * <p>Once you've obtained a {@code Locale} you can query it for information
- * about itself. Use {@code getCountry} to get the country (or region)
- * code and {@code getLanguage} to get the language code.
- * You can use {@code getDisplayCountry} to get the
- * name of the country suitable for displaying to the user. Similarly,
- * you can use {@code getDisplayLanguage} to get the name of
- * the language suitable for displaying to the user. Interestingly,
- * the {@code getDisplayXXX} methods are themselves locale-sensitive
- * and have two versions: one that uses the default
- * {@link Locale.Category#DISPLAY DISPLAY} locale and one
- * that uses the locale specified as an argument.
*
- * <p>The Java Platform provides a number of classes that perform locale-sensitive
- * operations. For example, the {@code NumberFormat} class formats
- * numbers, currency, and percentages in a locale-sensitive manner. Classes
- * such as {@code NumberFormat} have several convenience methods
- * for creating a default object of that type. For example, the
- * {@code NumberFormat} class provides these three convenience methods
- * for creating a default {@code NumberFormat} object:
- * {@snippet lang=java :
- * NumberFormat.getInstance();
- * NumberFormat.getCurrencyInstance();
- * NumberFormat.getPercentInstance();
- * }
- * Each of these methods has two variants; one with an explicit locale
- * and one without; the latter uses the default
- * {@link Locale.Category#FORMAT FORMAT} locale:
- * {@snippet lang=java :
- * NumberFormat.getInstance(myLocale);
- * NumberFormat.getCurrencyInstance(myLocale);
- * NumberFormat.getPercentInstance(myLocale);
- * }
- * A {@code Locale} is the mechanism for identifying the kind of object
- * ({@code NumberFormat}) that you would like to get. The locale is
- * <STRONG>just</STRONG> a mechanism for identifying objects,
- * <STRONG>not</STRONG> a container for the objects themselves.
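The snippets in the new "Usage Examples" wording are fragments; a self-contained version might look like the sketch below. The class and helper names are hypothetical. The expected strings for Locale.US are the ones quoted in the specification text above, and since locale data may vary by release they should be read as typical output rather than a guarantee.

```java
import java.text.NumberFormat;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

// Hypothetical demo class, not part of the CSR.
class LocaleUsageDemo {

    /** Formats an amount as currency for the given locale. */
    static String currency(Locale locale, long amount) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    /** Formats a date in the LONG style for the given locale. */
    static String longDate(Locale locale, LocalDate date) {
        return DateTimeFormatter.ofLocalizedDate(FormatStyle.LONG)
                .localizedBy(locale)
                .format(date);
    }

    /** The no-arg display method uses the default DISPLAY locale. */
    static boolean displayDefaultsAgree() {
        Locale current = Locale.getDefault();
        return current.getDisplayCountry().equals(
                current.getDisplayCountry(Locale.getDefault(Locale.Category.DISPLAY)));
    }

    public static void main(String[] args) {
        System.out.println(currency(Locale.US, 1000));                     // "$1,000.00"
        System.out.println(longDate(Locale.US, LocalDate.of(2024, 1, 1))); // "January 1, 2024"
        System.out.println(currency(Locale.JAPAN, 1000));
        System.out.println(longDate(Locale.JAPAN, LocalDate.of(2024, 1, 1)));
        System.out.println(displayDefaultsAgree());                        // "true"
    }
}
```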
Update the Compatibility section with an implementation note tag and the following wording,
+ * @implNote
+ * <h2>Compatibility</h2>
+ * <p> The following commentary is provided for apps that want to ensure
+ * interoperability with older releases of {@code Locale} provided by the
+ * reference implementation.
In the Compatibility section, also remove the following paragraph regarding the deprecated constructor behavior,
- * <h4>Three-letter language/country(region) codes</h4>
- *
- * <p>The Locale constructors have always specified that the language
- * and the country param be two characters in length, although in
- * practice they have accepted any length. The specification has now
- * been relaxed to allow language codes of two to eight characters and
- * country (region) codes of two to three characters, and in
- * particular, three-letter language codes and three-digit region
- * codes as specified in the IANA Language Subtag Registry. For
- * compatibility, the implementation still does not impose a length
- * constraint.
- *
In the getDisplayLanguage() and getDisplayCountry() methods, remove the following,
* If the name returned cannot be localized for the default
* {@link Locale.Category#DISPLAY DISPLAY} locale,
- * (say, we don't have a Japanese name for Croatian),
* this function falls back on the English name, and uses the ISO code as a last-resort
* value. If the locale doesn't specify a language, this function returns the empty string.
In the getDisplayLanguage(Locale) and getDisplayCountry(Locale) method overloads, remove the following,
* If the name returned cannot be localized according to inLocale.
- * (say, we don't have a Japanese name for Croatia),
* this function falls back on the English name, and finally
Drop outdated "installed" terminology regarding LocaleServiceProviders in getAvailableLocales(),
- * {@return an array of installed locales}
+ * {@return an array of available locales}
*
* The returned array represents the union of locales supported
- * by the Java runtime environment and by installed
+ * by the Java runtime environment and by deployed
Drop outdated "installed" terminology regarding LocaleServiceProviders in availableLocales(),
- * {@return a stream of installed locales}
+ * {@return a stream of available locales}
*
* The returned stream represents the union of locales supported
- * by the Java runtime environment and by installed
+ * by the Java runtime environment and by deployed
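For orientation, the two methods touched here can be queried as in the sketch below, assuming JDK 21 or later for `Locale.availableLocales()`. The class and helper names are hypothetical. Which locales are reported depends on the runtime and on any deployed `LocaleServiceProvider` implementations, so the sketch only looks for Locale.US.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical demo class, not part of the CSR.
class AvailableLocalesDemo {

    /** Looks for a locale in the array form. */
    static boolean arrayContains(Locale locale) {
        return Arrays.asList(Locale.getAvailableLocales()).contains(locale);
    }

    /** Looks for a locale in the stream form. */
    static boolean streamContains(Locale locale) {
        return Locale.availableLocales().anyMatch(locale::equals);
    }

    public static void main(String[] args) {
        System.out.println(arrayContains(Locale.US));
        System.out.println(streamContains(Locale.US));
        // The total depends on the runtime's locale data and deployed providers.
        System.out.println(Locale.getAvailableLocales().length);
    }
}
```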
Add the Unicode LDML as an external specification link,
+ * @spec https://unicode.org/reports/tr35/
+ * Unicode Locale Data Markup Language
CSR of: JDK-8341923 java.util.Locale class specification improvements (Resolved)