Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: None
Affects Version/s: None
Component/s: core-libs
Labels:
- dcspt
- webbug

Subcomponent: java.lang
CPU: generic
OS: generic

A DESCRIPTION OF THE PROBLEM :

String.codePointCount currently has only one overload, which takes begin and end indices. Developers are likely to expect a second, no-argument overload that counts the code points in the entire string.

There are two workarounds today:

1. str.codePoints().count()
2. str.codePointCount(0, str.length())

However, workaround 1 does extra work (it yields every code point in the string), while workaround 2 (1) requires the string to be assigned to a variable first, (2) makes the source code more verbose, and (3) performs an unnecessary bounds check (https://github.com/openjdk/jdk/blob/0735dc27c71de46896afd2f0f608319304a3d549/src/java.base/share/classes/java/lang/String.java#L1698C9-L1698C66).
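
For reference, the two workarounds can be compared side by side with the API that exists today. This is only a minimal sketch: the class name CodePointCounts and the helper names viaStream and viaRange are hypothetical and are not part of the JDK or of this proposal. The sample string is built from code points (U+30EDE twice, then U+9EBA) so that the example does not depend on source file encoding.

```java
public final class CodePointCounts {
    private CodePointCounts() {}

    /** Workaround 1: streams every code point, then counts them. */
    public static long viaStream(String s) {
        return s.codePoints().count();
    }

    /** Workaround 2: applies the existing range overload to the whole string. */
    public static int viaRange(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+30EDE is a supplementary code point and occupies two chars.
        String biang = new String(Character.toChars(0x30EDE));
        String str = biang + biang + "\u9EBA";

        System.out.println(str.length());   // 5 (UTF-16 code units)
        System.out.println(viaStream(str)); // 3 (code points)
        System.out.println(viaRange(str));  // 3 (code points)
    }
}
```

Both helpers return the same count; they differ only in return type (long versus int) and in how much work they do to get there.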

=========
Use cases:
=========

if (userName.codePointCount() > 20) {
    IO.println("The user name is too long to store in VARCHAR(20) in utf8mb4 MySQL!");
}
// https://pages.nist.gov/800-63-4/sp800-63b.html#passwordver
// Verifiers and CSPs SHALL require passwords to be a minimum of eight characters in length
// Each Unicode code point SHALL be counted as a single character when evaluating password length.
if (password.codePointCount() < 8) {
    IO.println("Password is too short!");
}

ACTUAL BEHAVIOR :

| Welcome to JShell -- Version 26-ea
| For an introduction type: /help intro

jshell> var str = "𰻞𰻞麺";
str ==> "𰻞𰻞麺"

jshell> str.codePointCount()
| Error:
| method codePointCount in class java.lang.String cannot be applied to given types;
| required: int,int
| found: no arguments
| reason: actual and formal argument lists differ in length
| jshell> str.codePointCount()
| ^----------------^

jshell> str.codePointCount(
Signatures:
int String.codePointCount(int beginIndex, int endIndex)

<press tab again to see documentation>
jshell> str.codePointCount(
int String.codePointCount(int beginIndex, int endIndex)
Returns the number of Unicode code points in the specified text range of this String .The text
range begins at the specified beginIndex and extends to the char at index endIndex - 1 . Thus
the length (in char s) of the text range is endIndex-beginIndex . Unpaired surrogates within
the text range count as one code point each.

Parameters:
beginIndex - the index to the first char of the text range.
endIndex - the index after the last char of the text range.

Returns:
the number of Unicode code points in the specified text range

Thrown Exceptions:
IndexOutOfBoundsException - if the beginIndex is negative, or endIndex is larger than the
                            length of this String , or beginIndex is larger than endIndex .

<press tab again to see all possible completions; total possible completions: 1,408>
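
The documented behavior of the existing two-argument overload can be checked with a short program. This is a sketch under stated assumptions: the class name CodePointCountEdgeCases and its helpers countRange and throwsOutOfBounds are hypothetical names used only for illustration. It exercises the two points called out in the documentation above: unpaired surrogates count as one code point each, and invalid index combinations throw IndexOutOfBoundsException.

```java
public final class CodePointCountEdgeCases {
    private CodePointCountEdgeCases() {}

    /** Thin wrapper over the existing two-argument overload. */
    public static int countRange(String s, int beginIndex, int endIndex) {
        return s.codePointCount(beginIndex, endIndex);
    }

    /** Returns true if the given range is rejected with IndexOutOfBoundsException. */
    public static boolean throwsOutOfBounds(String s, int beginIndex, int endIndex) {
        try {
            s.codePointCount(beginIndex, endIndex);
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A well-formed surrogate pair: two chars, one code point.
        String pair = new String(Character.toChars(0x30EDE));
        System.out.println(countRange(pair, 0, 2)); // 1

        // A range that cuts the pair in half sees an unpaired high surrogate.
        System.out.println(countRange(pair, 0, 1)); // 1

        // A lone high surrogate followed by 'a': each counts as one code point.
        String lone = String.valueOf((char) 0xD883) + "a";
        System.out.println(countRange(lone, 0, 2)); // 2

        System.out.println(throwsOutOfBounds(pair, -1, 1)); // true: negative beginIndex
        System.out.println(throwsOutOfBounds(pair, 0, 3));  // true: endIndex > length
        System.out.println(throwsOutOfBounds(pair, 2, 1));  // true: beginIndex > endIndex
    }
}
```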

relates to: JDK-4985217 JSR 204: Add codePointCount methods to Character, String and StringBuffer/Builder (Resolved)

links to: Review(master) openjdk/jdk/26461

Assignee: Unassigned
Reporter: Webbug Group
Votes: 0
Watchers: 3
Created: 2025-07-22 18:59
Updated: 2025-07-28 09:19
