javadoc now uses a sentence BreakIterator to find the end of
the first sentence, which it uses as the summary. Some common
constructs cause it to break too soon. For example:
CASE #1 --------------------------------------------------
import java.text.BreakIterator;

public class SentenceBug {
    public static void main(String[] argv) {
        BreakIterator bi = BreakIterator.getSentenceInstance();
        String test = "Test <code>Flags.Flag</code> class. Another test.";
        bi.setText(test);
        System.out.println(test.substring(bi.first(), bi.next()));
        System.exit(0);
    }
}
This prints "Test <code>Flags."
A period immediately followed by a capital letter should not be treated as
a sentence boundary; a boundary should require whitespace after the period.
CASE #2 --------------------------------------------------
import java.text.BreakIterator;

public class SentenceBug2 {
    public static void main(String[] argv) {
        BreakIterator bi = BreakIterator.getSentenceInstance();
        // Inner double quotes are escaped so that the literal compiles.
        String test = "<P>Provides a set of \"lightweight\" (all-Java<FONT SIZE=\"-2\"><SUP>TM</SUP></FONT> language) components that, to the maximum degree possible, work the same on all platforms. Another test.";
        bi.setText(test);
        System.out.println(test.substring(bi.first(), bi.next()));
        System.exit(0);
    }
}
This prints:
<P>Provides a set of "lightweight" (all-Java<FONT SIZE="-2"
Notice that it stops between the double quote (") and greater-than symbol (>).
There is no period, exclamation mark or question mark anywhere near.
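For comparison, here is a minimal sketch of a simpler rule that would not break early on either case: end the first sentence at the first period that lies outside an HTML tag and is followed by whitespace or the end of the text. The class name FirstSentenceSketch and the method firstSentence are hypothetical, introduced only for illustration; this is not how javadoc or BreakIterator is implemented, and it assumes that every '<' in the text opens a tag.

```java
public class FirstSentenceSketch {

    // Hypothetical helper: returns the text up to and including the first
    // period that is outside an HTML tag and is followed by whitespace or
    // the end of the text. Returns the whole text if no such period exists.
    // Assumption: every '<' opens a tag and every '>' closes one.
    static String firstSentence(String text) {
        boolean inTag = false;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '<') {
                inTag = true;
            } else if (c == '>') {
                inTag = false;
            } else if (c == '.' && !inTag) {
                boolean atEnd = (i + 1 == text.length());
                if (atEnd || Character.isWhitespace(text.charAt(i + 1))) {
                    return text.substring(0, i + 1);
                }
            }
        }
        return text;
    }

    public static void main(String[] argv) {
        // The two test strings from CASE #1 and CASE #2.
        String case1 = "Test <code>Flags.Flag</code> class. Another test.";
        String case2 = "<P>Provides a set of \"lightweight\" (all-Java<FONT SIZE=\"-2\"><SUP>TM</SUP></FONT> language) components that, to the maximum degree possible, work the same on all platforms. Another test.";
        System.out.println(firstSentence(case1));
        System.out.println(firstSentence(case2));
    }
}
```

With this rule, the period in "Flags.Flag" is skipped because no whitespace follows it, and the quote inside the FONT tag is never examined. The sketch ignores exclamation marks, question marks, and abbreviations such as "e.g. ", so it only illustrates the expected behavior for the two cases above.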
----
For sample files, see /java/web/docs/bugs/javadoc-bugs/bug4158381-breakiterator
doug.kramer@Eng 1998-09-17
- duplicates
  JDK-4143071 Sentence BreakIterator has trouble with sentence which end with a date/number (Closed)
- relates to
  JDK-4140384 design bug: ambiguous "first sentence" rule (Closed)
  JDK-4172961 First sentence does not stop if it ends in a capital letter - propose {@period} (Closed)