JDK-4086052

BreakIterator doesn't honor Unicode non-breaking space


    • 1.1.6
    • x86
    • windows_95
    • Verified



        Name: paC48320 Date: 10/14/97


        The Unicode character \u00a0 is supposed to
        represent a non-breaking space. Thus I would
        expect that a BreakIterator.getLineInstance()
        iterator would not use that character as a valid
        break position. However, it seems to treat it
        the same as a regular space.

        Here's a test program:


        import java.text.*;

        class breaktest
        {
            public static void main(String[] args)
            {
                // "foo" and "bar" are joined by a non-breaking space.
                String s = "foo\u00a0bar";

                BreakIterator boundary = BreakIterator.getLineInstance();
                boundary.setText(s);

                // Print each run between consecutive line-break positions.
                int start = boundary.first();
                int end;

                while (true)
                {
                    end = boundary.next();
                    if (end == BreakIterator.DONE)
                        break;
                    System.out.println("start = " + start + " end = " + end);
                    System.out.println(s.substring(start, end));
                    start = end;
                }
            }
        }





        I would expect that this would print only one run
        of text. However, I get two runs: "foo\u00a0"
        and "bar".
        company - Lotus Development Corp., email - ###@###.###
        ======================================================================
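        A possible workaround, sketched here only as an illustration of the
        expected behavior: post-filter the positions returned by the line
        iterator and drop any interior boundary that immediately follows
        \u00a0. The class NbspAwareLineBreaks and its boundaries() method
        are hypothetical names, not part of the JDK, and the sketch assumes
        a current Java release (generics, java.util.List) rather than 1.1.6.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not a JDK class.
class NbspAwareLineBreaks {
    private static final char NBSP = '\u00a0';

    // Returns the boundaries reported by BreakIterator.getLineInstance(),
    // minus any interior boundary whose preceding character is a
    // non-breaking space. The first and last boundaries are always kept.
    static List<Integer> boundaries(String text) {
        List<Integer> result = new ArrayList<>();
        BreakIterator it = BreakIterator.getLineInstance();
        it.setText(text);
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            boolean interior = pos > 0 && pos < text.length();
            if (interior && text.charAt(pos - 1) == NBSP) {
                continue; // suppress a break directly after the NBSP
            }
            result.add(pos);
        }
        return result;
    }

    public static void main(String[] args) {
        // String from the report: no boundary should fall between "foo" and "bar".
        System.out.println(boundaries("foo\u00a0bar"));
        // An ordinary space should still yield a boundary after it.
        System.out.println(boundaries("foo bar"));
    }
}
```

        With this filter the string from the test program is treated as a
        single run whether or not the underlying iterator breaks after
        \u00a0, while a regular space still produces a break. It covers
        only \u00a0 and does not address other characters that prohibit
        line breaks.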

              joconnersunw John Oconner (Inactive)
              pallenba Peter Allenbach (Inactive)