JDK-4214367

BreakIterator returns wrong Japanese word boundaries.


    • Bug
    • Resolution: Fixed
    • P4
    • 1.3.0
    • 1.2.0
    • core-libs
    • kestrel
    • generic
    • generic



      Name: clC74495 Date: 02/24/99


      BreakIterator returns wrong word boundaries for Japanese words
      that include any of the following characters:

      U+309D HIRAGANA ITERATION MARK
      U+309E HIRAGANA VOICED ITERATION MARK

      U+30FD KATAKANA ITERATION MARK
      U+30FE KATAKANA VOICED ITERATION MARK

      U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK
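One way to observe the behavior is to print the segments that the word iterator reports for short strings containing these marks. The following is a minimal sketch, not part of the original report: the class name `WordBoundaryDemo`, the helper `segments`, and the sample words are illustrative assumptions. The boundaries printed depend on the JDK version in use, so no expected output is shown.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical reproduction helper; names are illustrative only.
class WordBoundaryDemo {

    // Splits text at every boundary the word BreakIterator reports.
    static List<String> segments(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        List<String> parts = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; end = words.next()) {
            parts.add(text.substring(start, end));
            start = end;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Sample words chosen for illustration; each contains one of the marks above.
        String[] samples = {
            "\u3053\u309D\u308D",       // hiragana word containing U+309D
            "\u307F\u3059\u309E",       // hiragana word containing U+309E
            "\u30CF\u30FD",             // katakana followed by U+30FD
            "\u30BF\u30FE",             // katakana followed by U+30FE
            "\u30B3\u30FC\u30D2\u30FC"  // katakana word containing U+30FC twice
        };
        for (String sample : samples) {
            // A word split around one of the marks would indicate the reported problem.
            System.out.println(sample + " -> " + segments(sample, Locale.JAPAN));
        }
    }
}
```

If a sample comes back as more than one segment, with a break next to the iteration or prolonged sound mark, the iterator is treating that mark as a separate word.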

      This problem occurs because java.text.WordBreakData lacks
      mappings for these characters. Applying the following patch
      adds the missing mappings and fixes the boundaries.

      *** WordBreakData.java.old Mon Feb 1 20:06:33 1999
      --- WordBreakData.java Mon Feb 1 20:11:31 1999
      ***************
      *** 314,321 ****
      --- 314,327 ----
                new SpecialMapping(HIRAGANA_LETTER_SMALL_A, HIRAGANA_LETTER_VU, hira),
                new SpecialMapping(COMBINING_KATAKANA_HIRAGANA_VOICED_SOUND_MARK,
                                   HIRAGANA_SEMIVOICED_SOUND_MARK, diacrit),
      +         new SpecialMapping(HIRAGANA_ITERATION_MARK,
      +                            HIRAGANA_VOICED_ITERATION_MARK, hira),
                new SpecialMapping(KATAKANA_LETTER_SMALL_A,
                                   KATAKANA_LETTER_SMALL_KE, kata),
      +         new SpecialMapping(KATAKANA_HIRAGANA_PROLONGED_SOUND_MARK,
      +                            diacrit),
      +         new SpecialMapping(KATAKANA_ITERATION_MARK,
      +                            KATAKANA_VOICED_ITERATION_MARK, kata),
                new SpecialMapping(UNICODE_LOW_BOUND_HAN,
                                   UNICODE_HIGH_BOUND_HAN, kanji),
                new SpecialMapping(HANGUL_SYL_LOW, HANGUL_SYL_HIGH, letter),
      (Review ID: 54644)
      ======================================================================
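The effect of the patch above can be pictured as a range lookup from character to break category. The sketch below is an illustration only and does not reproduce the actual `WordBreakData` implementation: the `Range` class, the `Category` enum, and `categoryOf` are hypothetical stand-ins for `SpecialMapping` and the `hira`, `kata`, and `diacrit` categories, and the code point ranges are taken from the Unicode names used in the patch. It assumes, as the patch implies, that a mark classified like its neighboring kana is no longer split off as a separate word.

```java
// Hypothetical model of the mapping table; not the JDK's real classes.
class KanaCategorySketch {

    enum Category { HIRA, KATA, DIACRIT, OTHER }

    // Stand-in for a SpecialMapping entry: an inclusive character range and its category.
    static final class Range {
        final char low;
        final char high;
        final Category category;

        Range(char low, char high, Category category) {
            this.low = low;
            this.high = high;
            this.category = category;
        }

        boolean contains(char c) {
            return c >= low && c <= high;
        }
    }

    static final Range[] RANGES = {
        new Range('\u3041', '\u3094', Category.HIRA),    // HIRAGANA LETTER SMALL A to VU
        new Range('\u3099', '\u309C', Category.DIACRIT), // voiced and semi-voiced sound marks
        new Range('\u309D', '\u309E', Category.HIRA),    // added: hiragana iteration marks
        new Range('\u30A1', '\u30F6', Category.KATA),    // KATAKANA LETTER SMALL A to SMALL KE
        new Range('\u30FC', '\u30FC', Category.DIACRIT), // added: prolonged sound mark
        new Range('\u30FD', '\u30FE', Category.KATA)     // added: katakana iteration marks
    };

    // Returns the category of the first range containing c, or OTHER if none matches.
    static Category categoryOf(char c) {
        for (Range range : RANGES) {
            if (range.contains(c)) {
                return range.category;
            }
        }
        return Category.OTHER;
    }
}
```

Without the three added entries, the five characters listed in this report would fall through to the default category in this model, which corresponds to the missing mappings described above.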

            rgillamsunw Richard Gillam (Inactive)
            clucasius Carlos Lucasius (Inactive)