Type: Bug
Resolution: Fixed
Priority: P4
Fix Version/s: 17
Affects Version/s: 8, 11, 17
Component/s: core-libs
Labels:

Subcomponent: java.text
Resolved In Build: b18
Verification: Not verified

ADDITIONAL SYSTEM INFORMATION :
java version "1.8.0_112"
Java(TM) SE Runtime Environment (build 1.8.0_112-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.112-b16, mixed mode)

This has also been reproduced on newer JDK versions, e.g., 14.

A DESCRIPTION OF THE PROBLEM :
When a sentence contains a parenthetical abbreviation such as "blah blah (i.e., blah blah), blah blah", the iterator returned by BreakIterator.getSentenceInstance() incorrectly reports a break after the "i.e" and before the "., blah blah)", which is not actually a sentence boundary.

FWIW, Stack Overflow discussion here: https://stackoverflow.com/q/66933006/263801

STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Run the test case program below.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
bi.preceding(30) returned -1
first sentence: "Due to a problem (e.g., software bug), the server is down."

ACTUAL -
bi.preceding(30) returned 21
first sentence: "Due to a problem (e.g"

---------- BEGIN SOURCE ----------
import java.text.BreakIterator;
import java.util.Locale;

public class BreakIteratorTest {
    public static void main(String[] args) throws Exception {
        String text = "Due to a problem (e.g., software bug), the server is down.";
        BreakIterator bi = BreakIterator.getSentenceInstance(Locale.US);
        bi.setText(text);
        // Offset 30 falls inside the word "software", after the "e.g." abbreviation.
        int r = bi.preceding(30);
        System.out.println("bi.preceding(30) returned " + r);
        // Treat everything before the reported boundary as the first sentence.
        String sentence = r == BreakIterator.DONE ? text : text.substring(0, r);
        System.out.println("first sentence: \"" + sentence + "\"");
    }
}

---------- END SOURCE ----------
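
A complementary way to inspect the behavior is to list every segment the sentence iterator produces rather than probing a single offset. The following is a minimal sketch, not part of the original submission: the class name SentenceBoundaries and the helper split are hypothetical names introduced for illustration, and only standard java.text.BreakIterator calls (getSentenceInstance, setText, first, next) are used.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, introduced only for this illustration.
public class SentenceBoundaries {

    // Splits the text at every boundary reported by the sentence iterator.
    static List<String> split(String text, Locale locale) {
        BreakIterator bi = BreakIterator.getSentenceInstance(locale);
        bi.setText(text);
        List<String> sentences = new ArrayList<>();
        int start = bi.first();
        // Walk forward boundary by boundary until the iterator is exhausted.
        for (int end = bi.next(); end != BreakIterator.DONE; start = end, end = bi.next()) {
            sentences.add(text.substring(start, end));
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Due to a problem (e.g., software bug), the server is down.";
        List<String> sentences = split(text, Locale.US);
        System.out.println("segment count: " + sentences.size());
        for (String s : sentences) {
            System.out.println("\"" + s + "\"");
        }
    }
}
```

Going by the ACTUAL output reported above, an affected build would presumably print more than one segment, with the first ending right after "(e.g", while a build containing the fix should report the whole text as a single sentence. Enumerating boundaries this way also avoids depending on what preceding(30) returns when no boundary falls inside the sentence.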

CUSTOMER SUBMITTED WORKAROUND :
None known

FREQUENCY : always

Attachment: BreakIteratorTest.java (0.6 kB, 2021-04-06 04:09)

relates to: JDK-8232447 The javadoc parser ends the first sentence of a comment too soon (Open)

links to: Commit openjdk/jdk/9ebc497b, Review openjdk/jdk/3400

Assignee: Naoto Sato
Reporter: Webbug Group
Votes: 0
Watchers: 6

Created: 2021-04-04 08:21
Updated: 2025-01-21 14:28
Resolved: 2021-04-09 11:12
