Description
Summary
Support Unicode extension for collation settings
Problem
Unicode Locale Data Markup Language (LDML) defines BCP 47 U extensions for collation settings. However, java.text.Collator instances created with java.text.Collator.getInstance(Locale) ignore the BCP 47 settings for strength (ks) and normalization (kk) specified in the given Locale instance, even though those settings could be applied to the created instance.
Solution
If the Locale argument to the 1-arg factory method getInstance(Locale) contains ks and/or kk collation settings, the factory method will call setStrength(int) and/or setDecomposition(int) on the created Collator instance accordingly. For unrecognized or unmappable setting values, such as level4, which Unicode defines but the JDK does not support, the settings in the Collator instance are not modified.
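The mapping described above can be sketched in user code as follows. This is only an illustration, not the JDK implementation: the class name CollationSettingsSketch and the helper collatorFor are hypothetical, and the helper strips the locale's extensions before calling Collator.getInstance so that the result does not depend on whether the running JDK already applies the settings itself.

```java
import java.text.Collator;
import java.util.Locale;

public class CollationSettingsSketch {

    // Hypothetical helper (not part of the JDK): applies the ks/kk mappings
    // listed in the Specification table to a freshly created Collator.
    static Collator collatorFor(Locale locale) {
        // Assumption: start from the extension-free locale so this sketch
        // behaves the same on JDKs with and without the proposed change.
        Collator collator = Collator.getInstance(locale.stripExtensions());

        // "ks" carries the strength setting, e.g. en-u-ks-level1.
        String ks = locale.getUnicodeLocaleType("ks");
        if (ks != null) {
            switch (ks) {
                case "level1":  collator.setStrength(Collator.PRIMARY);   break;
                case "level2":  collator.setStrength(Collator.SECONDARY); break;
                case "level3":  collator.setStrength(Collator.TERTIARY);  break;
                case "identic": collator.setStrength(Collator.IDENTICAL); break;
                default:        break; // e.g. level4: leave the strength unchanged
            }
        }

        // "kk" carries the normalization setting, e.g. en-u-kk-false.
        String kk = locale.getUnicodeLocaleType("kk");
        if (kk != null) {
            switch (kk) {
                case "true":  collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION); break;
                case "false": collator.setDecomposition(Collator.NO_DECOMPOSITION);        break;
                default:      break; // unrecognized value: leave the decomposition unchanged
            }
        }
        return collator;
    }

    public static void main(String[] args) {
        Collator primary = collatorFor(Locale.forLanguageTag("en-u-ks-level1"));
        Collator unchanged = collatorFor(Locale.forLanguageTag("en-u-ks-level4"));

        // Print the resulting strengths and a case-only comparison at PRIMARY strength.
        System.out.println("level1 strength: " + primary.getStrength());
        System.out.println("level4 strength: " + unchanged.getStrength());
        System.out.println("compare(\"a\", \"A\") at level1: " + primary.compare("a", "A"));
    }
}
```

With the proposed change in place, the same effect would be expected from calling Collator.getInstance(Locale) directly with a locale such as en-u-ks-level1, without any helper.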
Specification
Modify the method description of java.text.Collator.getInstance(Locale) as follows:
/**
- * Gets the Collator for the desired locale.
+ * Gets the Collator for the desired locale. If the desired locale
+ * has the "{@code ks}" and/or the "{@code kk}"
+ * <a href="https://www.unicode.org/reports/tr35/tr35-collation.html#Setting_Options">
+ * Unicode collation settings</a>, this method will call {@linkplain #setStrength(int)}
+ * and/or {@linkplain #setDecomposition(int)} on the created instance, if the specified
+ * Unicode collation settings are recognized based on the following mappings:
+ * <table class="striped">
+ * <caption style="display:none">Strength/Decomposition mappings</caption>
+ * <thead>
+ * <tr><th scope="col">BCP 47 values for strength (ks)</th>
+ * <th scope="col">Collator constants for strength</th></tr>
+ * </thead>
+ * <tbody>
+ * <tr><th scope="row" style="text-align:left">level1</th>
+ * <td>PRIMARY</td></tr>
+ * <tr><th scope="row" style="text-align:left">level2</th>
+ * <td>SECONDARY</td></tr>
+ * <tr><th scope="row" style="text-align:left">level3</th>
+ * <td>TERTIARY</td></tr>
+ * <tr><th scope="row" style="text-align:left">identic</th>
+ * <td>IDENTICAL</td></tr>
+ * </tbody>
+ * <thead>
+ * <tr><th scope="col">BCP 47 values for normalization (kk)</th>
+ * <th scope="col">Collator constants for decomposition</th></tr>
+ * </thead>
+ * <tbody>
+ * <tr><th scope="row" style="text-align:left">true</th>
+ * <td>CANONICAL_DECOMPOSITION</td></tr>
+ * <tr><th scope="row" style="text-align:left">false</th>
+ * <td>NO_DECOMPOSITION</td></tr>
+ * </tbody>
+ * </table>
+ * If the specified setting value is not recognized, the strength and/or
+ * decomposition will not be overridden, as if there were no BCP 47 collation
+ * options in the desired locale.
+ *
* @apiNote Implementations of {@code Collator} class may produce
An image of the table is attached for convenience.
Attachments
Issue Links
- csr of JDK-8308108 Support Unicode extension for collation settings (Resolved)