Summary
Clarify behavior for BreakIterator instances when text has not been set and a boundary searching operation is called.
Problem
A method that searches for a boundary in the text can be invoked on an instance of the abstract class java.text.BreakIterator
even when neither of the setText() methods has been called.
For example, the result of BreakIterator.getWordInstance().next() is ambiguous:
is it an exception, 0, -1 (BreakIterator.DONE), or something else?
The current specification does not explain the outcome in such a case.
Solution
Document the actual behavior: the default implementations initialize an empty StringCharacterIterator, and any subsequent boundary operations are performed on an empty string. The change is added to the class description using an @implNote tag, since repeating the behavior on each boundary-searching method would be redundant.
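As an illustration only (not part of the specification change), the behavior described above can be observed with a minimal sketch like the following. It assumes the JDK's default word-instance implementation; the class name UnsetTextDemo and the helper probe are hypothetical names introduced for this example.

```java
import java.text.BreakIterator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical demo class; not part of the JDK or of this change.
public class UnsetTextDemo {

    // Runs a few boundary operations and records what each one returns.
    static int[] probe(BreakIterator it) {
        return new int[] { it.first(), it.next(), it.current(), it.last() };
    }

    public static void main(String[] args) {
        // No setText() call: per the @implNote, the default implementation
        // behaves as if setText("") had been called.
        BreakIterator unset = BreakIterator.getWordInstance(Locale.ROOT);

        // Explicitly set to the empty string, for comparison.
        BreakIterator empty = BreakIterator.getWordInstance(Locale.ROOT);
        empty.setText("");

        int[] a = probe(unset);
        int[] b = probe(empty);

        System.out.println("unset: " + Arrays.toString(a));
        System.out.println("empty: " + Arrays.toString(b));
        System.out.println("same results: " + Arrays.equals(a, b));
        System.out.println("next() returned DONE: " + (a[1] == BreakIterator.DONE));
    }
}
```

Because this is an @implNote rather than a normative requirement, BreakIterator subclasses supplied by other providers may behave differently, so callers that need a guaranteed result should still call setText() explicitly.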
Specification
--- a/src/java.base/share/classes/java/text/BreakIterator.java
+++ b/src/java.base/share/classes/java/text/BreakIterator.java
* @implSpec The default implementation of the character boundary analysis
* conforms to the Unicode Consortium's Extended Grapheme Cluster breaks.
* For more detail, refer to
* <a href="https://www.unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries">
* Grapheme Cluster Boundaries</a> section in the Unicode Standard Annex #29.
*
- * <p>
+ * @implNote The default implementations of {@code BreakIterator} will perform the equivalent
+ * of calling {@code setText("")} if the text hasn't been set by either
+ * {@link #setText(String)} or {@link #setText(CharacterIterator)}
+ * and a boundary searching operation is called by the {@code BreakIterator} instance.
* The {@code BreakIterator} instances returned by the factory methods
* of this class are intended for use with natural languages only, not for
* programming language text. It is however possible to define subclasses

csr of: JDK-6333341 [BI] Doc: java.text.BreakIterator class specification is unclear (Resolved)