-
Type: CSR
-
Resolution: Approved
-
Priority: P4
-
Component/s: core-libs
-
None
-
Compatibility Kind: behavioral
-
Compatibility Risk: low
-
Compatibility Risk Description: The behavior for locales without the ks/kk extensions remains the same. If the user specifies a locale with those extensions, the behavior will change, but I believe it is a favorable change.
-
Interface Kind: Java API
-
Scope: SE
Summary
Support Unicode extension for collation settings
Problem
Unicode Locale Data Markup Language (LDML) defines BCP 47 U extensions for collation settings. However, java.text.Collator instances created with java.text.Collator.getInstance(Locale) ignore the BCP 47 settings for strength (ks) and normalization (kk) specified in the given Locale instance, even though those settings could be applied to the created instance.
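To make the problem concrete, the following minimal sketch shows how the ks and kk settings travel inside a Locale as BCP 47 U extension keywords. The class name LocaleCollationKeys, the helper setting(...), and the language tag are illustrative assumptions, not part of this CSR; only Locale.forLanguageTag and Locale.getUnicodeLocaleType are existing JDK APIs.

```java
import java.util.Locale;

class LocaleCollationKeys {

    // Hypothetical helper: returns the type registered for a Unicode locale
    // key, or "(not set)" when the locale does not carry that key.
    static String setting(Locale locale, String key) {
        String type = locale.getUnicodeLocaleType(key);
        return type == null ? "(not set)" : type;
    }

    public static void main(String[] args) {
        // Example tag chosen for illustration: Swedish with strength and
        // normalization keywords in the U extension.
        Locale locale = Locale.forLanguageTag("sv-u-ks-level1-kk-true");

        System.out.println("ks = " + setting(locale, "ks"));
        System.out.println("kk = " + setting(locale, "kk"));
        System.out.println("co = " + setting(locale, "co"));
    }
}
```

The Locale already exposes these values; the gap described above is only that the Collator factory method does not act on them.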
Solution
If the Locale argument to the 1-arg factory method getInstance(Locale) contains ks and/or kk collation settings, the factory method will call setStrength(int) and/or setDecomposition(int) on the created Collator instance accordingly. For unrecognized or unmappable setting values, such as Unicode's level4, which the JDK does not support, the settings in the Collator instance are not modified.
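The proposed behavior can be approximated in user code as follows. This is a sketch under stated assumptions, not the JDK's implementation: the class CollationSettingsSketch and the helper collatorFor(...) are hypothetical names, and the mapping simply follows the table in the Specification section. On a JDK that already includes this change, the helper re-applies the same settings and is therefore harmless.

```java
import java.text.Collator;
import java.util.Locale;

class CollationSettingsSketch {

    // Hypothetical helper, not the JDK's code: creates a Collator and then
    // applies any recognized "ks"/"kk" settings found in the locale.
    static Collator collatorFor(Locale locale) {
        Collator collator = Collator.getInstance(locale);

        // getUnicodeLocaleType returns null when the key is absent.
        String ks = locale.getUnicodeLocaleType("ks");
        if (ks != null) {
            switch (ks) {
                case "level1":  collator.setStrength(Collator.PRIMARY);   break;
                case "level2":  collator.setStrength(Collator.SECONDARY); break;
                case "level3":  collator.setStrength(Collator.TERTIARY);  break;
                case "identic": collator.setStrength(Collator.IDENTICAL); break;
                default: break; // e.g. "level4": leave the strength untouched
            }
        }

        String kk = locale.getUnicodeLocaleType("kk");
        if (kk != null) {
            switch (kk) {
                case "true":
                    collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
                    break;
                case "false":
                    collator.setDecomposition(Collator.NO_DECOMPOSITION);
                    break;
                default:
                    break; // unrecognized value: leave the decomposition untouched
            }
        }
        return collator;
    }

    public static void main(String[] args) {
        // Recognized strength: the collator is switched to PRIMARY.
        Collator primary = collatorFor(Locale.forLanguageTag("en-u-ks-level1"));
        System.out.println(primary.getStrength() == Collator.PRIMARY);

        // Unmappable strength: the collator keeps the locale's default.
        Collator unmapped = collatorFor(Locale.forLanguageTag("en-u-ks-level4"));
        Collator baseline = Collator.getInstance(Locale.ENGLISH);
        System.out.println(unmapped.getStrength() == baseline.getStrength());
    }
}
```

Falling through on unknown values, rather than throwing, mirrors the stated rule that unrecognized settings leave the Collator as if no collation options were present.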
Specification
Modify the method description of java.text.Collator.getInstance(Locale) as follows:
/**
- * Gets the Collator for the desired locale.
+ * Gets the Collator for the desired locale. If the desired locale
+ * has the "{@code ks}" and/or the "{@code kk}"
+ * <a href="https://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options">
+ * Unicode collation settings</a>, this method will call {@linkplain #setStrength(int)}
+ * and/or {@linkplain #setDecomposition(int)} on the created instance, if the specified
+ * Unicode collation settings are recognized based on the following mappings:
+ * <table class="striped">
+ * <caption style="display:none">Strength/Decomposition mappings</caption>
+ * <thead>
+ * <tr><th scope="col">BCP 47 values for strength (ks)</th>
+ * <th scope="col">Collator constants for strength</th></tr>
+ * </thead>
+ * <tbody>
+ * <tr><th scope="row" style="text-align:left">level1</th>
+ * <td>PRIMARY</td></tr>
+ * <tr><th scope="row" style="text-align:left">level2</th>
+ * <td>SECONDARY</td></tr>
+ * <tr><th scope="row" style="text-align:left">level3</th>
+ * <td>TERTIARY</td></tr>
+ * <tr><th scope="row" style="text-align:left">identic</th>
+ * <td>IDENTICAL</td></tr>
+ * </tbody>
+ * <thead>
+ * <tr><th scope="col">BCP 47 values for normalization (kk)</th>
+ * <th scope="col">Collator constants for decomposition</th></tr>
+ * </thead>
+ * <tbody>
+ * <tr><th scope="row" style="text-align:left">true</th>
+ * <td>CANONICAL_DECOMPOSITION</td></tr>
+ * <tr><th scope="row" style="text-align:left">false</th>
+ * <td>NO_DECOMPOSITION</td></tr>
+ * </tbody>
+ * </table>
+ * If the specified setting value is not recognized, the strength and/or
+ * decomposition will not be overridden, as if there were no BCP 47 collation
+ * options in the desired locale.
+ *
* @apiNote Implementations of {@code Collator} class may produce
An image of the table is attached for convenience.
- csr of: JDK-8308108 Support Unicode extension for collation settings (Resolved)