JDK-4095325

[BI] RFE: Need special word-break tables for Chinese

Details

    • CPU: generic, x86, sparc
    • OS: generic, solaris_2.5, windows_95

    Description

      Name: bb33257 Date: 11/25/97


      The word-break tables (that is, the tables used by the BreakIterator
      returned by BreakIterator.getWordInstance(); the line-breaking
      tables are fine) treat CJK characters in a Japanese-specific way:
      an arbitrary run of Kanji characters, followed by an optional
      arbitrary run of Hiragana characters, followed by an optional
      arbitrary run of Katakana characters, is treated as a
      single "word" by the word-break iterator. Chinese text, however,
      uses neither Hiragana nor Katakana, so whole paragraphs
      (instead of individual ideographs) end up being
      treated as "words" for the purposes of double-click selection
      and "find whole words" operations. Chinese therefore
      requires its own state tables for word breaking.
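
The behavior described above can be inspected with a small program. This is a minimal sketch, not part of the original report: the class name `WordBreakDemo`, the helper `wordBoundaries`, and the sample sentence are assumptions chosen for illustration. It only lists the boundary offsets that `BreakIterator.getWordInstance()` reports. The offsets you see depend on the JDK version and its break rules, so no particular segmentation is claimed here.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WordBreakDemo {
    /** Collects every boundary offset the word-break iterator reports for the text. */
    static List<Integer> wordBoundaries(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<Integer> offsets = new ArrayList<>();
        // first() returns the start offset; next() returns DONE after the last boundary.
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            offsets.add(pos);
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Illustrative sample only: six Han ideographs with no Hiragana or Katakana.
        String chinese = "\u4eca\u5929\u5929\u6c14\u5f88\u597d"; // 今天天气很好
        System.out.println("zh: " + wordBoundaries(chinese, Locale.CHINESE));
        System.out.println("ja: " + wordBoundaries(chinese, Locale.JAPANESE));
    }
}
```

If the iterator reports only the offsets 0 and 6 for this sample, the whole run of ideographs is being treated as one "word", which is the symptom this report describes.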
      ======================================================================

      Dictionary-based break iterators may also be needed for Korean and Japanese.
      ###@###.### 11/2/04 18:15 GMT
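
To make "dictionary-based" concrete, here is a rough sketch of one common approach, greedy longest-match segmentation. It is not the JDK's implementation and is not taken from this issue. The class `LongestMatchSegmenter`, the method `segment`, and the toy dictionary are hypothetical, and the code indexes by `char`, so it assumes text without supplementary characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class LongestMatchSegmenter {
    /** Greedy forward longest-match segmentation against a word list (hypothetical helper). */
    static List<String> segment(String text, Set<String> dictionary, int maxWordLength) {
        if (maxWordLength < 1) {
            throw new IllegalArgumentException("maxWordLength must be at least 1");
        }
        List<String> words = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(text.length(), start + maxWordLength);
            // Shrink the candidate until it is a dictionary word or a single character.
            while (end > start + 1 && !dictionary.contains(text.substring(start, end))) {
                end--;
            }
            words.add(text.substring(start, end));
            start = end;
        }
        return words;
    }

    public static void main(String[] args) {
        // Toy dictionary for illustration: 東京都, 東京, に, 住む
        Set<String> dictionary = Set.of(
                "\u6771\u4eac\u90fd", "\u6771\u4eac", "\u306b", "\u4f4f\u3080");
        // Input: 東京都に住む
        System.out.println(segment("\u6771\u4eac\u90fd\u306b\u4f4f\u3080", dictionary, 3));
        // prints [東京都, に, 住む]
    }
}
```

Greedy matching is simple but can pick the wrong split when words overlap. A production break iterator would need a real lexicon and some way to resolve ambiguous matches.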

            People

              peytoia Yuka Kamiya (Inactive)
              bcbeck Brian Beck (Inactive)