JDK-4143071

Sentence BreakIterator has trouble with sentences which end with a date/number


    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 1.1.6
    • core-libs
    • sparc
    • solaris_2.6



      Name: tb29552 Date: 05/27/98


      A BreakIterator retrieved by getSentenceInstance()
      doesn't find a sentence break if the sentence ends
      with a number. For example, it reports only a single
      sentence in the following text, where there should be three:

       "Today is the 27th of May, 1998. Tomorrow will be 28 May 1998. The day after will be the 30th."

      Here's a simple program:

      import java.text.BreakIterator;

      public class bug
      {
        public static void main(String[] args)
        {
          int pos = 0;
          String theText = "Today is the 27th of May, 1998. Tomorrow will be 28 May 1998. The day after will be the 30th.";

          // Iterate over sentence boundaries using the default locale.
          BreakIterator breaks = BreakIterator.getSentenceInstance();
          breaks.setText(theText);
          // next() returns BreakIterator.DONE (-1) after the last boundary,
          // so the final line printed is "Sentence Break at -1".
          while (pos != BreakIterator.DONE) {
            pos = breaks.next();
            System.out.println("Sentence Break at " + pos);
          }
        }
      }


      The output is
      Sentence Break at 93
      Sentence Break at -1


      I'd like to see
      Sentence Break at 31
      Sentence Break at 61
      Sentence Break at 93
      Sentence Break at -1

      This has been tested using default Locales of en
      and en_AU.
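
      To make the result easier to inspect, the same check can print
      the text of each detected sentence rather than only the offsets.
      The following is a minimal sketch, not part of the original
      report: the class name SentenceBoundaries and the helper
      boundaries() are hypothetical, and Locale.ENGLISH is passed
      explicitly on the assumption that it matches the "en" default
      used above. The offsets printed depend on the JDK in use and
      may include the whitespace that follows each sentence.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SentenceBoundaries {
    // Hypothetical helper (not from the original report): collects every
    // boundary offset the sentence iterator reports for the given text.
    static List<Integer> boundaries(String text, Locale locale) {
        BreakIterator breaks = BreakIterator.getSentenceInstance(locale);
        breaks.setText(text);
        List<Integer> result = new ArrayList<>();
        // first() returns the start of the text; next() returns DONE once
        // the last boundary has been passed.
        for (int pos = breaks.first(); pos != BreakIterator.DONE; pos = breaks.next()) {
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        String theText = "Today is the 27th of May, 1998. Tomorrow will be 28 May 1998. The day after will be the 30th.";
        List<Integer> offsets = boundaries(theText, Locale.ENGLISH);
        // Each adjacent pair of offsets delimits one detected sentence.
        for (int i = 1; i < offsets.size(); i++) {
            int start = offsets.get(i - 1);
            int end = offsets.get(i);
            System.out.println("Sentence " + i + " [" + start + ", " + end + "): "
                + theText.substring(start, end));
        }
    }
}
```

      If only one sentence is printed, the iterator is showing the
      behavior described in this report.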
      (Review ID: 32445)
      ======================================================================

            rgillamsunw Richard Gillam (Inactive)
            tbell Tim Bell