JDK-4087096

RFE: java.text.BreakIterator needs more functionality for handling word breaks

Details

    • 1.2beta4
    • x86
    • windows_95
    • Not verified

    Description



      Name: joT67522 Date: 10/17/97


      I was really pleased when I first saw the
      BreakIterator class, as I was hoping I would not
      have to write my own wordbreak logic again for
      the umpteenth time. Unfortunately, if you try to
      use the class to implement the full complement of
      word movement functions, you quickly find that
      there are some fundamental pieces missing.

      For instance: I need to write a nextWord() method.
      However, calling BreakIterator.next() does NOT
      move you to the beginning of the next word -- it
      moves you to the next word boundary. This may in
      fact be the start of a word, but it may also be the
      end of a word, depending on where you started
      from. This in itself wouldn't be too big a deal,
      as long as there were a way to determine which edge
      you're currently on ... but there isn't. You have
      to resort to looking at character classifications,
      which is ultimately doomed to failure, especially
      when you start dealing with Chinese/Japanese/
      Korean text.
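
The behaviour described above can be reproduced with a small sketch. The class and method names here (WordBoundaryDemo, boundaries) are hypothetical and are not part of the report; the code only walks the word iterator and records every offset that next() returns.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical demo class, not part of the JDK or of the original report.
class WordBoundaryDemo {

    // Collects every boundary offset reported by first() and next().
    static List<Integer> boundaries(String text) {
        BreakIterator it = BreakIterator.getWordInstance(Locale.US);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // Offsets 5 and 11 are word ends, 0 and 6 are word starts;
        // the iterator itself does not say which is which.
        System.out.println(boundaries("Hello world")); // prints [0, 5, 6, 11]
    }
}
```

Starting from offset 0, a single call to next() stops at 5, the end of "Hello", rather than at 6, the start of "world", which is the problem the report describes.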

      In the same vein, I also need to implement
      startOfWord(), endOfWord(), isStartOfWord(),
      and isEndOfWord(). As far as I can tell, there is
      no way to concoct these using the tools provided
      by BreakIterator.
      company - Lotus Development, email - ###@###.###
      ======================================================================
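
For illustration only, the requested operations can be approximated as below. This is a sketch under stated assumptions, not a fix: it relies on exactly the character-classification workaround that the report calls unreliable (Character.isLetterOrDigit), so it should not be expected to behave correctly for Chinese, Japanese, or Korean text. It also assumes the isBoundary(int) and following(int) methods found in current versions of java.text.BreakIterator. The class WordNavSketch and its methods are hypothetical names.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper; a heuristic sketch, not an API provided by the JDK.
class WordNavSketch {
    private final String text;
    private final BreakIterator it;

    WordNavSketch(String text, Locale locale) {
        this.text = text;
        this.it = BreakIterator.getWordInstance(locale);
        this.it.setText(text);
    }

    // Heuristic: a segment counts as a word if it starts with a letter or digit.
    private boolean letterOrDigitAt(int offset) {
        return offset >= 0 && offset < text.length()
                && Character.isLetterOrDigit(text.codePointAt(offset));
    }

    // True if offset is a boundary and the text after it looks like a word.
    boolean isStartOfWord(int offset) {
        return letterOrDigitAt(offset) && it.isBoundary(offset);
    }

    // True if offset is a boundary and the text before it looks like a word.
    boolean isEndOfWord(int offset) {
        if (offset <= 0 || offset > text.length()) {
            return false;
        }
        // The end of the text is always a boundary, so isBoundary is skipped there.
        boolean boundary = offset == text.length() || it.isBoundary(offset);
        return boundary && Character.isLetterOrDigit(text.codePointBefore(offset));
    }

    // Offset of the first word start strictly after offset, or BreakIterator.DONE.
    int nextWord(int offset) {
        if (offset < 0 || offset >= text.length()) {
            return BreakIterator.DONE;
        }
        int pos = it.following(offset);
        while (pos != BreakIterator.DONE && !letterOrDigitAt(pos)) {
            pos = it.next();
        }
        return pos;
    }

    public static void main(String[] args) {
        WordNavSketch nav = new WordNavSketch("Hello, world", Locale.US);
        System.out.println(nav.nextWord(0));      // prints 7
        System.out.println(nav.isEndOfWord(5));   // prints true
        System.out.println(nav.isStartOfWord(6)); // prints false
    }
}
```

The sketch has to inspect the characters on either side of a boundary because the iterator reports only where boundaries are, not whether they open or close a word, which is the gap this request asks to have filled.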

          People

            rgillamsunw Richard Gillam (Inactive)
            johsunw Joon Oh (Inactive)