Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-7071819

To support Extended Grapheme Clusters in Regex

XMLWordPrintable

    • b106
    • generic
    • generic
    • Verified

      2.2 Extended Grapheme Clusters

      One or more Unicode characters may make up what the user thinks of as a character. To avoid ambiguity with the computer use of the term character, this is called a grapheme cluster. For example, "G" + acute-accent is a grapheme cluster: it is thought of as a single character by users, yet is actually represented by two Unicode characters. The Unicode Standard defines extended grapheme clusters that keep Hangul syllables together and do not break between base characters and combining marks. The precise definition is in UTR #29: Text Boundaries [UAX29]. These extended grapheme clusters are not the same as tailored grapheme clusters, which are covered in Level 3, Tailored Grapheme Clusters.

            sherman Xueming Shen
            sherman Xueming Shen
            Votes:
            0 Vote for this issue
            Watchers:
            3 Start watching this issue

              Created:
              Updated:
              Resolved:
              Imported:
              Indexed: