JDK-4098467

BreakIterator doesn't handle conjoining Hangul jamo


Details

    • 1.1.6
    • x86
    • windows_nt
    • Verified


      Description



        Name: bb33257 Date: 12/10/97


        Korean text made up of conjoining Hangul jamo (rather than
        precomposed Hangul syllables) isn't handled properly by the
        iterator returned by BreakIterator.getCharacterInstance().

        It treats each jamo as a separate character, rather than treating
        each whole syllable as a single character.
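
For readers unfamiliar with the two encodings, here is a minimal sketch of the difference between a conjoining-jamo sequence and its precomposed syllable. It is not part of the original report; the class name `JamoComposeDemo` and the helper `compose` are hypothetical, and the sketch relies on `java.text.Normalizer` (added in Java 6, so it did not exist in the 1.1.6 release this report targets).

```java
import java.text.Normalizer;

// Hypothetical illustration class, not from the original report.
class JamoComposeDemo {
    // NFC normalization composes a leading consonant + vowel
    // (+ optional trailing consonant) into one precomposed syllable.
    static String compose(String jamo) {
        return Normalizer.normalize(jamo, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String jamo = "\u1109\u1161\u11bc";   // three conjoining jamo
        String syllable = compose(jamo);      // one precomposed syllable
        System.out.println(jamo.length() + " code units -> " + syllable.length());
    }
}
```

Both forms represent the same user-perceived character, which is why a character iterator is expected to treat the three-jamo form as one unit.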

        Try the following test text:

        \u1109\u1161\u11bc
        \u1112\u1161\u11bc
        <space>
        \u1112\u1161\u11ab
        \u110b\u1175\u11ab
        <space>
        \u110b\u1167\u11ab
        \u1112\u1161\u11b8
        <space>
        \u110c\u1161\u11bc
        \u1105\u1169
        \u1100\u116d
        \u1112\u116c

        The breaks should be where the line divisions in the preceding
        example are ("<space>" denotes an ASCII space). Right now, there's
        a character break after every Unicode character.
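
The test text above can be turned into a small reproduction program. The following is a sketch, not code from the original report: the class name `JamoBreakDemo` and its helpers are hypothetical, and it uses generics and `String.join`, so it needs Java 8 or later rather than the 1.1.6 release named in this report. It prints the boundaries the report asks for next to the boundaries the iterator actually returns; whether the two match depends on the JDK version being run.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical reproduction class, not from the original report.
class JamoBreakDemo {
    // Each element is one expected "character" from the report:
    // a conjoining-jamo syllable or a single ASCII space.
    static final String[] UNITS = {
        "\u1109\u1161\u11bc", "\u1112\u1161\u11bc", " ",
        "\u1112\u1161\u11ab", "\u110b\u1175\u11ab", " ",
        "\u110b\u1167\u11ab", "\u1112\u1161\u11b8", " ",
        "\u110c\u1161\u11bc", "\u1105\u1169", "\u1100\u116d", "\u1112\u116c"
    };

    // The full test string, with the units concatenated in order.
    static String text() {
        return String.join("", UNITS);
    }

    // Offsets where the report says character breaks should fall.
    static List<Integer> expectedBoundaries() {
        List<Integer> out = new ArrayList<>();
        int pos = 0;
        out.add(pos);
        for (String unit : UNITS) {
            pos += unit.length();
            out.add(pos);
        }
        return out;
    }

    // Offsets the character iterator actually reports for the given text.
    static List<Integer> actualBoundaries(String s) {
        BreakIterator it = BreakIterator.getCharacterInstance();
        it.setText(s);
        List<Integer> out = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            out.add(b);
        }
        return out;
    }

    public static void main(String[] args) {
        String s = text();
        System.out.println("expected: " + expectedBoundaries());
        System.out.println("actual:   " + actualBoundaries(s));
    }
}
```

On a release that shows the reported behavior, the second list would contain every offset from 0 to the text length, one break per Unicode character.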
        ======================================================================


              People

                joconnersunw John Oconner (Inactive)
                bcbeck Brian Beck (Inactive)