JDK-4242585

[BI] BreakIterator.getLineBreak() breaks on an embedded period (re-open 4097920)


    • Type: Bug
    • Resolution: Won't Fix
    • Priority: P4
    • None
    • Affected versions: 1.2.0, 1.2.2, 1.4.0
    • Component: core-libs
    • CPU: generic, x86
    • OS: generic, linux, windows_nt

      Name: krT82822 Date: 05/29/99


      The line iterator, obtained from BreakIterator.getLineInstance()
      (called getLineBreak() in the summary), breaks "sun.com" right after
      the period. A period followed by a non-whitespace character
      should not be treated as the end of a word or a sentence.

      As supporting evidence, note that the word iterator,
      BreakIterator.getWordInstance(), returns "sun.com" as one word.

      Apparently this was done correctly at one time, until 4097920
      was "fixed".
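The comparison described above can be reproduced with a small sketch along the following lines. The class name `BoundaryCompare` and the helper `segments` are hypothetical and not part of the original report. The segmentation produced depends on the JDK release in use, so no expected output is shown.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BoundaryCompare {

    // Hypothetical helper: collects the text between consecutive boundaries.
    static List<String> segments(BreakIterator it, String text) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            out.add(text.substring(start, end));
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "sun.com";
        // Print both segmentations side by side so they can be compared.
        System.out.println("word: " + segments(BreakIterator.getWordInstance(Locale.US), text));
        System.out.println("line: " + segments(BreakIterator.getLineInstance(Locale.US), text));
    }
}
```

If the two printed lists differ, the line iterator is reporting a break opportunity that the word iterator does not treat as a word boundary.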

      --------------------

      (5/29/99 kevin.ryan@eng -- verified with excerpt from java.text.BreakIterator docs:)

      import java.text.BreakIterator;
      import java.util.Locale;

      public class Yikes {

          public static void main(String[] args) {
              String stringToExamine = "sun.com";
              BreakIterator boundary = BreakIterator.getLineInstance(Locale.US);
              boundary.setText(stringToExamine);
              printEachForward(boundary, stringToExamine);
          }

          // Prints each segment between consecutive boundaries, one per line.
          public static void printEachForward(BreakIterator myIter, String source) {
              int start = myIter.first();
              for (int end = myIter.next();
                   end != BreakIterator.DONE;
                   start = end, end = myIter.next()) {
                  System.out.println(source.substring(start, end));
              }
          }
      }

      (Review ID: 83639)
      ======================================================================

      Name: boT120536 Date: 12/05/2000


      java version "1.2.2"
      Java HotSpot(TM) Server VM (2.0rc2, mixed mode, build D)

      When I try to use BreakIterator.getLineInstance, instead of breaking it up into
      lines, it behaves exactly the same as getWordInstance, i.e. breaking it up
      based on words.
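For context, the java.text.BreakIterator documentation describes line boundary analysis as finding positions where text can be broken when line-wrapping, not positions of newline characters. A minimal sketch of that intended use follows, assuming a greedy wrapping strategy; the class name `WrapSketch`, the helper `wrap`, and the width of 10 are illustrative choices, not taken from the report.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

public class WrapSketch {

    // Hypothetical helper: greedily packs the segments between line-break
    // opportunities into lines of at most `width` characters where possible.
    // Trailing whitespace stays attached to its segment and counts toward
    // the width; a single segment longer than `width` is kept whole.
    static List<String> wrap(String text, int width) {
        BreakIterator it = BreakIterator.getLineInstance();
        it.setText(text);
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            if (current.length() > 0 && current.length() + piece.length() > width) {
                lines.add(current.toString());
                current.setLength(0);
            }
            current.append(piece);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : wrap("this is a test with line", 10)) {
            System.out.println("[" + line + "]");
        }
    }
}
```

Because each segment keeps its trailing whitespace, concatenating the wrapped lines gives back the original text unchanged.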

      I need this to work because I'm trying to write a stack trace to a log, and
      want to indent every line after the first line of the stack trace.

      Here is the source:

      import java.text.BreakIterator;

      public class EnetLog extends Object {
           public final static void convertToFill(String message)
           {
                System.out.println("break iterator: message is: " + message );
                // Pick the iterator type from a keyword in the message; default to LINE.
                BreakIterator theIterator = null;
                if (message.indexOf("line") > 0 ) {
                     theIterator = BreakIterator.getLineInstance();
                     System.out.println("LINE");
                }
                else if (message.indexOf("sentence") > 0 ) {
                     theIterator = BreakIterator.getSentenceInstance();
                     System.out.println("SENTENCE");
                }
                else if (message.indexOf("word") > 0 ) {
                     theIterator = BreakIterator.getWordInstance();
                     System.out.println("WORD");
                }
                else if (message.indexOf("character") > 0 ) {
                     theIterator = BreakIterator.getCharacterInstance();
                     System.out.println("CHARACTER");
                }
                else {
                     theIterator = BreakIterator.getLineInstance();
                     System.out.println("LINE");
                }

                theIterator.setText(message);
                StringBuffer newMessage = new StringBuffer();

                // Print every segment between consecutive boundaries.
                int start = theIterator.first();
                for (int end = theIterator.next(); end != BreakIterator.DONE;
                     start = end, end = theIterator.next())
                {
                     System.out.println("at break iterator, start: " + start + " end: " + end
                          + " line is: " + message.substring(start,end) );
                }
           }

           public static void main (String[] args)
           {
              convertToFill("this is a test with line\n this is another test;\n this is the last test") ;
              convertToFill("this is a test with sentence") ;
              convertToFill("this is a test with word") ;
              convertToFill("this is a test with character") ;
           }
      }


      **************************************************
      Here is the output:
      D:\src\com\mot\hris\util>java com.mot.hris.util.EnetLog
      break iterator: message is: this is a test with line
       this is another test;
       this is the last test
      LINE
      at break iterator, start: 0 end: 5 line is: this
      at break iterator, start: 5 end: 8 line is: is
      at break iterator, start: 8 end: 10 line is: a
      at break iterator, start: 10 end: 15 line is: test
      at break iterator, start: 15 end: 20 line is: with
      at break iterator, start: 20 end: 25 line is: line

      at break iterator, start: 25 end: 26 line is:
      at break iterator, start: 26 end: 31 line is: this
      at break iterator, start: 31 end: 34 line is: is
      at break iterator, start: 34 end: 42 line is: another
      at break iterator, start: 42 end: 48 line is: test;

      at break iterator, start: 48 end: 49 line is:
      at break iterator, start: 49 end: 54 line is: this
      at break iterator, start: 54 end: 57 line is: is
      at break iterator, start: 57 end: 61 line is: the
      at break iterator, start: 61 end: 66 line is: last
      at break iterator, start: 66 end: 70 line is: test
      break iterator: message is: this is a test with sentence
      SENTENCE
      at break iterator, start: 0 end: 28 line is: this is a test with sentence
      break iterator: message is: this is a test with word
      WORD
      at break iterator, start: 0 end: 4 line is: this
      at break iterator, start: 4 end: 5 line is:
      at break iterator, start: 5 end: 7 line is: is
      at break iterator, start: 7 end: 8 line is:
      at break iterator, start: 8 end: 9 line is: a
      at break iterator, start: 9 end: 10 line is:
      at break iterator, start: 10 end: 14 line is: test
      at break iterator, start: 14 end: 15 line is:
      at break iterator, start: 15 end: 19 line is: with
      at break iterator, start: 19 end: 20 line is:
      at break iterator, start: 20 end: 24 line is: word
      break iterator: message is: this is a test with character
      CHARACTER
      at break iterator, start: 0 end: 1 line is: t
      at break iterator, start: 1 end: 2 line is: h
      at break iterator, start: 2 end: 3 line is: i
      at break iterator, start: 3 end: 4 line is: s
      at break iterator, start: 4 end: 5 line is:
      at break iterator, start: 5 end: 6 line is: i
      at break iterator, start: 6 end: 7 line is: s
      at break iterator, start: 7 end: 8 line is:
      at break iterator, start: 8 end: 9 line is: a
      at break iterator, start: 9 end: 10 line is:
      at break iterator, start: 10 end: 11 line is: t
      at break iterator, start: 11 end: 12 line is: e
      at break iterator, start: 12 end: 13 line is: s
      at break iterator, start: 13 end: 14 line is: t
      at break iterator, start: 14 end: 15 line is:
      at break iterator, start: 15 end: 16 line is: w
      at break iterator, start: 16 end: 17 line is: i
      at break iterator, start: 17 end: 18 line is: t
      at break iterator, start: 18 end: 19 line is: h
      at break iterator, start: 19 end: 20 line is:
      at break iterator, start: 20 end: 21 line is: c
      at break iterator, start: 21 end: 22 line is: h
      at break iterator, start: 22 end: 23 line is: a
      at break iterator, start: 23 end: 24 line is: r
      at break iterator, start: 24 end: 25 line is: a
      at break iterator, start: 25 end: 26 line is: c
      at break iterator, start: 26 end: 27 line is: t
      at break iterator, start: 27 end: 28 line is: e
      at break iterator, start: 28 end: 29 line is: r

      D:\src\com\mot\hris\util>
      (Review ID: 108482)
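The goal stated in this report, indenting every line of a stack trace after the first, can be approached without BreakIterator by splitting on line terminators directly. The following is only a sketch under that assumption; the class name `IndentSketch`, the helper `indentContinuationLines`, and the sample trace text are hypothetical.

```java
public class IndentSketch {

    // Hypothetical helper: leaves the first line as is and prefixes every
    // later line with `indent`. Lines are split on \r\n, \r, or \n, and
    // rejoined with \n, so line endings are normalized.
    static String indentContinuationLines(String message, String indent) {
        String[] lines = message.split("\r\n|\r|\n", -1);
        StringBuilder out = new StringBuilder(lines[0]);
        for (int i = 1; i < lines.length; i++) {
            out.append('\n').append(indent).append(lines[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up trace text, used only to exercise the helper.
        String trace = "java.lang.IllegalStateException: example\n"
                + "\tat Example.run(Example.java:10)\n"
                + "\tat Example.main(Example.java:5)";
        System.out.println(indentContinuationLines(trace, "    "));
    }
}
```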
      ======================================================================

            rgoel Rachna Goel (Inactive)
            kryansunw Kevin Ryan (Inactive)