Type: Bug
Resolution: Fixed
Priority: P4
Fix Version/s: 19
Affects Version/s: 8, 11, 17, 18
Component/s: core-libs
Labels: 18ea, dcsaw, hgupdate-sync, noreg-doc, reproducer-yes, webbug
Subcomponent: java.util
Resolved In Build: b03
CPU: generic
OS: generic

Issue	Fix Version	Assignee	Priority	Status	Resolution	Resolved In Build
JDK-8279734	18.0.1	Naoto Sato	P4	Resolved	Fixed	b02
JDK-8278959	18	Naoto Sato	P4	Resolved	Fixed	b29

A DESCRIPTION OF THE PROBLEM :
https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/StringTokenizer.html#%3Cinit%3E(java.lang.String,java.lang.String,boolean) says: "Each delimiter is returned as a string of length one." This is incorrect when a delimiter is a supplementary character, i.e. one encoded as a valid Unicode surrogate pair: the returned string then has length two, because the delimiter is represented by two code units.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
"Each delimiter is returned as a string of the code unit(s) of the delimiter."

Alternatively, remove "Each delimiter is returned as a string of length one." and clarify that "characters" in the StringTokenizer documentation refers to Unicode code points, as in other documentation, e.g., that of String: "The String class provides methods for dealing with Unicode code points (i.e., characters), in addition to those for dealing with Unicode code units (i.e., char values)." - https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/String.html.
ACTUAL -
"Each delimiter is returned as a string of length one."

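The code unit versus code point distinction behind this report can be illustrated with a minimal sketch. The class name `CodeUnitsVsCodePoints` and its helper methods are illustrative only and not part of the report; the calls used are the standard `String.length()`, `String.codePointCount(int, int)` and `Character.isSurrogatePair(char, char)`.

```java
public class CodeUnitsVsCodePoints {

  // Number of UTF-16 code units (char values) in s.
  static int codeUnits(String s) {
    return s.length();
  }

  // Number of Unicode code points in s.
  static int codePoints(String s) {
    return s.codePointCount(0, s.length());
  }

  public static void main(String[] args) {
    String comma = ",";                   // U+002C, in the Basic Multilingual Plane
    String grinningFace = "\uD83D\uDE00"; // U+1F600, a supplementary character

    System.out.println(codeUnits(comma) + " " + codePoints(comma));               // prints "1 1"
    System.out.println(codeUnits(grinningFace) + " " + codePoints(grinningFace)); // prints "2 1"

    // The two char values of the supplementary character form a surrogate pair.
    System.out.println(
        Character.isSurrogatePair(grinningFace.charAt(0), grinningFace.charAt(1))); // prints "true"
  }
}
```

A delimiter such as U+1F600 is therefore a single character in the code point sense, yet any string containing it has length two.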
---------- BEGIN SOURCE ----------
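import java.util.StringTokenizer;

public class StringTokenizerPlayground {

  public static void main(String[] args) {
    // U+1F600 is a supplementary character: one code point, two UTF-16 code units.
    final var s = "\uD83D\uDE00"; // Grinning Face
    // Use the same string as input and as delimiter set; returnDelims = true.
    final var tokenizer = new StringTokenizer(s, s, true);

    final var tokenCount = tokenizer.countTokens();

    // The input consists of exactly one delimiter, so exactly one token is expected.
    if (tokenCount != 1) {
      throw new AssertionError();
    }

    final var token = tokenizer.nextToken();

    // The returned delimiter has length two, not one as the documentation states.
    if (token.length() != 2) {
      throw new AssertionError();
    }

    // The token is the complete surrogate pair.
    if (!token.equals(s)) {
      throw new AssertionError();
    }
  }
}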
---------- END SOURCE ----------
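
As a complement to the reproducer above, the following sketch mixes a BMP delimiter with a supplementary one in the same delimiter set. The class `DelimiterTokens`, the helper `tokensWithDelims` and the sample input are assumptions made for illustration; only the public `StringTokenizer` API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterTokens {

  // Collects every token, delimiters included (returnDelims = true).
  static List<String> tokensWithDelims(String text, String delims) {
    StringTokenizer tokenizer = new StringTokenizer(text, delims, true);
    List<String> tokens = new ArrayList<>();
    while (tokenizer.hasMoreTokens()) {
      tokens.add(tokenizer.nextToken());
    }
    return tokens;
  }

  public static void main(String[] args) {
    String face = "\uD83D\uDE00"; // U+1F600, encoded as a surrogate pair
    // Delimiter set: a comma (one code unit) and the grinning face (two code units).
    for (String token : tokensWithDelims("a,b" + face + "c", "," + face)) {
      System.out.println(token.length() + " code unit(s), "
          + token.codePointCount(0, token.length()) + " code point(s)");
    }
  }
}
```

Each delimiter comes back as one token holding a single code point, but its `length()` is one for the comma and two for the supplementary character, which is the behavior the corrected wording needs to cover.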

FREQUENCY : always


StringTokenizerPlayground.java
0.6 kB
2021-12-12 01:18

backported by
- JDK-8278959 StringTokenizer(String, String, boolean) documentation bug (Resolved)
- JDK-8279734 StringTokenizer(String, String, boolean) documentation bug (Resolved)

csr for
- JDK-8278814 StringTokenizer(String, String, boolean) documentation bug (Closed)

links to
- Commit openjdk/jdk18/9cd70906
- Commit openjdk/jdk/8f5fdd86
- Review openjdk/jdk18/43
- Review openjdk/jdk/6836


Assignee: Naoto Sato
Reporter: Webbug Group
Votes: 0
Watchers: 5
Created: 2021-12-10 05:23
Updated: 2022-01-10 09:32
Resolved: 2021-12-16 13:42
