JDK-8364007

Add no-argument codePointCount method to CharSequence and String

    • Type: Enhancement
    • Resolution: Unresolved
    • Priority: P4
    • None
    • None
    • core-libs
    • generic
    • generic

      A DESCRIPTION OF THE PROBLEM :

      Currently, String.codePointCount has only one overload, which takes begin and end indices. Developers would reasonably expect a no-argument overload that counts the code points in the entire string.

      Two workarounds exist today:

      1. str.codePoints().count()
      2. str.codePointCount(0, str.length())

      However, workaround 1 does extra work, because it yields every code point in the string through a stream. Workaround 2 (1) requires the string to be assigned to a variable first, (2) makes the source code more verbose, and (3) performs an extra bounds check (https://github.com/openjdk/jdk/blob/0735dc27c71de46896afd2f0f608319304a3d549/src/java.base/share/classes/java/lang/String.java#L1698C9-L1698C66).
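For comparison, here is a minimal sketch of the two workarounds next to a helper that stands in for the requested method. The class name CodePointCounts and the static helper codePointCount(CharSequence) are hypothetical and exist only for illustration; they are not part of the JDK and are not the proposed implementation. The sample string is written with escapes and has the same shape as the one in the JShell session below: two supplementary characters followed by one BMP character.

```java
final class CodePointCounts {
    private CodePointCounts() {}

    /** Workaround 1: stream every code point, then count the stream elements. */
    static long viaStream(CharSequence s) {
        return s.codePoints().count();
    }

    /** Workaround 2: call the existing ranged overload over the whole string. */
    static int viaRange(String s) {
        return s.codePointCount(0, s.length());
    }

    /**
     * Hypothetical stand-in for a no-argument codePointCount().
     * Delegates to the existing Character.codePointCount(CharSequence, int, int).
     */
    static int codePointCount(CharSequence s) {
        return Character.codePointCount(s, 0, s.length());
    }

    public static void main(String[] args) {
        // Two supplementary code points (a surrogate pair each) plus one BMP char.
        String str = "\uD883\uDEDE\uD883\uDEDE\u9EBA";

        System.out.println(str.length());        // 5 (UTF-16 code units)
        System.out.println(viaStream(str));      // 3
        System.out.println(viaRange(str));       // 3
        System.out.println(codePointCount(str)); // 3
    }
}
```

All three approaches agree on the result; the difference is in how much the caller has to write and how much work happens along the way.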

      =========
      Use cases:
      =========

      if (userName.codePointCount() > 20) {
          IO.println("The user name is too long to store in VARCHAR(20) in utf8mb4 MySQL!");
      }
      // https://pages.nist.gov/800-63-4/sp800-63b.html#passwordver
      // Verifiers and CSPs SHALL require passwords to be a minimum of eight characters in length
      // Each Unicode code point SHALL be counted as a single character when evaluating password length.
      if (password.codePointCount() < 8) {
          IO.println("Password is too short!");
      }
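The same checks can be written against the current API, which also shows why length() is the wrong measure for them. This is only a sketch: the class LengthChecks, its method names, and its constants are hypothetical, and the limits 20 and 8 are taken from the use cases above.

```java
final class LengthChecks {
    // Limits taken from the use cases above.
    static final int MAX_USER_NAME_CODE_POINTS = 20;
    static final int MIN_PASSWORD_CODE_POINTS = 8;

    private LengthChecks() {}

    /** True if the name has at most 20 code points. */
    static boolean userNameFits(String userName) {
        return userName.codePointCount(0, userName.length()) <= MAX_USER_NAME_CODE_POINTS;
    }

    /** True if the password has at least 8 code points. */
    static boolean passwordLongEnough(String password) {
        return password.codePointCount(0, password.length()) >= MIN_PASSWORD_CODE_POINTS;
    }

    public static void main(String[] args) {
        // 20 supplementary characters: 40 UTF-16 code units, 20 code points.
        String name = "\uD883\uDEDE".repeat(20);
        System.out.println(name.length());            // 40
        System.out.println(userNameFits(name));       // true

        // 4 supplementary characters: 8 UTF-16 code units, only 4 code points.
        String password = "\uD83D\uDE00".repeat(4);
        System.out.println(password.length());            // 8
        System.out.println(passwordLongEnough(password)); // false
    }
}
```

A check based on length() would reject the 20-character user name and accept the 4-character password, which is the opposite of what both rules intend.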

      ACTUAL BEHAVIOR :

      | Welcome to JShell -- Version 26-ea
      | For an introduction type: /help intro

      jshell> var str = "𰻞𰻞麺";
      str ==> "𰻞𰻞麺"

      jshell> str.codePointCount()
      | Error:
      | method codePointCount in class java.lang.String cannot be applied to given types;
      | required: int,int
      | found: no arguments
      | reason: actual and formal argument lists differ in length
      | jshell> str.codePointCount()
      | ^----------------^

      jshell> str.codePointCount(
      Signatures:
      int String.codePointCount(int beginIndex, int endIndex)

      <press tab again to see documentation>
      jshell> str.codePointCount(
      int String.codePointCount(int beginIndex, int endIndex)
      Returns the number of Unicode code points in the specified text range of this String. The text
      range begins at the specified beginIndex and extends to the char at index endIndex - 1. Thus
      the length (in chars) of the text range is endIndex-beginIndex. Unpaired surrogates within
      the text range count as one code point each.

      Parameters:
      beginIndex - the index to the first char of the text range.
      endIndex - the index after the last char of the text range.

      Returns:
      the number of Unicode code points in the specified text range

      Thrown Exceptions:
      IndexOutOfBoundsException - if the beginIndex is negative, or endIndex is larger than the
                                  length of this String, or beginIndex is larger than endIndex.

      <press tab again to see all possible completions; total possible completions: 1,408>

            Assignee: Unassigned
            Reporter: Webbug Group (webbuggrp)
            Votes: 0
            Watchers: 3