JDK-4098467

BreakIterator doesn't handle conjoining Hangul jamo


    • 1.1.6
    • x86
    • windows_nt
    • Verified



        Name: bb33257 Date: 12/10/97


        Korean text made up of conjoining Hangul jamo (rather than
        precomposed Hangul syllables) isn't treated properly by the
        iterator returned by BreakIterator.getCharacterInstance().

        It treats each jamo element as a character, rather than treating
        whole syllables as characters.

        Try the following test text:

        \u1109\u1161\u11bc
        \u1112\u1161\u11bc
        <space>
        \u1112\u1161\u11ab
        \u110b\u1175\u11ab
        <space>
        \u110b\u1167\u11ab
        \u1112\u1161\u11b8
        <space>
        \u110c\u1161\u11bc
        \u1105\u1169
        \u1100\u116d
        \u1112\u116c

        The breaks should be where the line divisions in the preceding
        example are ("<space>" denotes an ASCII space). Right now, there's
        a character break after every Unicode character.
        ======================================================================
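
The check described above can be sketched as a small standalone program. This is only an illustration, not part of the original report: the class name JamoBreakCheck and its helper methods are hypothetical. It builds the test text from the segments listed in the report, derives the expected boundaries from the segment lengths, and collects the boundaries actually reported by BreakIterator.getCharacterInstance() so the two lists can be compared.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

public class JamoBreakCheck {
    // Test text from the report, one expected "character" per element.
    // A single ASCII space stands in for each "<space>" line.
    static final String[] EXPECTED_UNITS = {
        "\u1109\u1161\u11bc", "\u1112\u1161\u11bc", " ",
        "\u1112\u1161\u11ab", "\u110b\u1175\u11ab", " ",
        "\u110b\u1167\u11ab", "\u1112\u1161\u11b8", " ",
        "\u110c\u1161\u11bc", "\u1105\u1169", "\u1100\u116d", "\u1112\u116c"
    };

    // Concatenates the segments into the full test string.
    static String testText() {
        StringBuilder sb = new StringBuilder();
        for (String unit : EXPECTED_UNITS) {
            sb.append(unit);
        }
        return sb.toString();
    }

    // Boundaries the report asks for: offset 0 plus the end of each segment.
    static List<Integer> expectedBoundaries() {
        List<Integer> out = new ArrayList<>();
        int offset = 0;
        out.add(offset);
        for (String unit : EXPECTED_UNITS) {
            offset += unit.length();
            out.add(offset);
        }
        return out;
    }

    // Boundaries reported by the character iterator of the running JDK.
    static List<Integer> actualBoundaries(String text) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(text);
        List<Integer> out = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            out.add(b);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> expected = expectedBoundaries();
        List<Integer> actual = actualBoundaries(testText());
        System.out.println("expected: " + expected);
        System.out.println("actual:   " + actual);
        System.out.println(expected.equals(actual)
                ? "Breaks fall on syllable boundaries."
                : "Breaks differ from the syllable boundaries in the report.");
    }
}
```

On a JDK showing the behavior reported here, the actual list would contain every offset from 0 through the text length, since each jamo is treated as its own character.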

              joconnersunw John Oconner (Inactive)
              bcbeck Brian Beck (Inactive)