- Enhancement
- Resolution: Fixed
- P4
- 7
- b106
- generic
- generic
- Verified
2.2 Extended Grapheme Clusters
One or more Unicode characters may make up what the user thinks of as a character. To avoid ambiguity with the computer use of the term character, this is called a grapheme cluster. For example, "G" + acute-accent is a grapheme cluster: it is thought of as a single character by users, yet is actually represented by two Unicode characters. The Unicode Standard defines extended grapheme clusters that keep Hangul syllables together and do not break between base characters and combining marks. The precise definition is in UTR #29: Text Boundaries [UAX29]. These extended grapheme clusters are not the same as tailored grapheme clusters, which are covered in Level 3, Tailored Grapheme Clusters.
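A minimal sketch of the distinction described above, assuming a Java 9 or later runtime in which `java.util.regex` supports the `\X` extended grapheme cluster matcher. The class name `GraphemeClusterSketch` and the helper `clusters` are hypothetical names chosen for illustration; they are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GraphemeClusterSketch {
    // \X matches one extended grapheme cluster (assumes Java 9 or later).
    private static final Pattern CLUSTER = Pattern.compile("\\X");

    // Hypothetical helper: splits text into its extended grapheme clusters.
    static List<String> clusters(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = CLUSTER.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        // "G" followed by U+0301 COMBINING ACUTE ACCENT: two chars, one cluster.
        String gAcute = "G\u0301";
        // Conjoining Hangul jamo (leading consonant, vowel, trailing consonant).
        String hangul = "\u1100\u1161\u11A8";

        System.out.println("gAcute chars:    " + gAcute.length());
        System.out.println("gAcute clusters: " + clusters(gAcute).size());
        System.out.println("hangul chars:    " + hangul.length());
        System.out.println("hangul clusters: " + clusters(hangul).size());
    }
}
```

In this sketch, `"G\u0301".length()` reports two `char` values, while `clusters` returns a single element for the same string, which mirrors the "G" + acute-accent example in the description. The Hangul jamo sequence is likewise kept together as one cluster.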
- relates to
- JDK-8222978 Upgrade the extended grapheme cluster support to the latest Unicode level. (Resolved)
- JDK-8149787 test/java/util/regex/GraphemeTest.java source file has non-ascii character u+00f7 (Resolved)
- JDK-8046101 JEP 111: Additional Unicode Constructs for Regular Expressions (Candidate)