Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Version/s: None
Affects Version/s: 6
Component/s: core-libs
Labels:
- webbug

Subcomponent: java.text
CPU: x86
OS: solaris_8

A DESCRIPTION OF THE REQUEST :
While writing some code to get the maximum common prefix of two Unicode CharSequences, it became apparent that the current API was not sufficient for an efficient implementation. Suggested changes:

Collator.compare(String, String) -> Collator.compare(CharSequence, CharSequence)

Suggested additions:

Collator.compare(int codepoint1, int codepoint2)
Character.toString(int codepoint)
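
For illustration, the requested signatures can be approximated today by a small helper built on the existing Collator.compare(String, String). This is only a sketch: the class name CollatorAdapters and both methods are hypothetical and are not part of java.text.Collator. It still allocates whenever an argument is not already a String, which is exactly the cost this request wants to avoid. (Character.toString(int) has since been added in Java 11; the sketch uses Character.toChars so that it also compiles on older releases.)

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper: none of these methods exist on java.text.Collator.
final class CollatorAdapters {
    private CollatorAdapters() {}

    // Stand-in for the requested Collator.compare(CharSequence, CharSequence).
    // toString() returns the same object for a String, but allocates a new
    // String for other CharSequence types such as StringBuilder.
    static int compare(Collator collator, CharSequence a, CharSequence b) {
        return collator.compare(a.toString(), b.toString());
    }

    // Stand-in for the requested Collator.compare(int, int).
    static int compare(Collator collator, int codePoint1, int codePoint2) {
        if (codePoint1 == codePoint2) {
            return 0; // identical code points always compare as equal
        }
        // Two String allocations (plus two char[] allocations) per call.
        return collator.compare(new String(Character.toChars(codePoint1)),
                                new String(Character.toChars(codePoint2)));
    }
}
```

A caller would use it as CollatorAdapters.compare(collator, new StringBuilder("abc"), "ABC"). A real implementation inside Collator could read the characters directly and skip the intermediate Strings.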

JUSTIFICATION :
While writing some code to get the maximum common prefix of two Unicode CharSequences, it became apparent that the current API was not sufficient for an efficient implementation. See the attached source code for an example. Basically, Strings are immutable, and the only comparison the Collator provides is String-based rather than based on the more generic CharSequence. If the data you are processing is not stored as Strings, you are forced to allocate Strings to do basic processing. Also, since there is no API for comparing single codepoints, processing such as finding the maximum common prefix requires up to (# of codepoints in smaller sequence * 2) memory allocations.

EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Unicode processing of CharSequences should require fewer memory allocations.
ACTUAL -
For example, finding the maximum common prefix of two Unicode CharSequences currently requires up to (# of codepoints in smaller sequence * 2) memory allocations.

---------- BEGIN SOURCE ----------
private static int getLengthOfMaxCommonPrefix(CharSequence str1, CharSequence str2, Collator collator) {
    if ((str1 == null) || (str2 == null)) { return 0; }
    // @todo get rid of memory allocation
    // Scratch buffer: chars 0-1 hold the code point from str1, chars 2-3 the one from str2.
    char[] charArray = new char[4];
    int index1 = 0; // char index into str1
    int index2 = 0; // char index into str2
    int i = 0;      // number of matching code points found so far
    while ((index1 < str1.length()) && (index2 < str2.length())) {
      int codePoint1 = Character.codePointAt(str1, index1);
      int codePoint2 = Character.codePointAt(str2, index2);
      // toChars returns how many chars (1 or 2) it wrote for the code point.
      int length1 = Character.toChars(codePoint1, charArray, 0);
      int length2 = Character.toChars(codePoint2, charArray, 2);
      // @todo get rid of memory allocation
      String char1Str = new String(charArray, 0, length1);
      // @todo get rid of memory allocation
      String char2Str = new String(charArray, 2, length2);
      if (collator.compare(char1Str, char2Str) != 0) {
        return i;
      }
      // Advance by chars, not code points, so surrogate pairs are stepped over correctly.
      index1 += length1;
      index2 += length2;
      i++;
    }
    return i;
  }

---------- END SOURCE ----------
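
Until such an API exists, one partial workaround is to skip the Collator whenever the two code points are identical, since identical strings always compare as equal. The sketch below assumes that per-code-point comparison is acceptable for the use case; the class and method names (CommonPrefix, lengthInCodePoints) are hypothetical. It allocates only on the slow path where the code points differ, so it reduces the allocation count but does not eliminate it.

```java
import java.text.Collator;
import java.util.Locale;

// Hypothetical helper, shown only to illustrate the fast path.
final class CommonPrefix {
    private CommonPrefix() {}

    // Returns the length, in code points, of the longest prefix of s1 and s2
    // whose code points compare as equal under the given collator.
    static int lengthInCodePoints(CharSequence s1, CharSequence s2, Collator collator) {
        if (s1 == null || s2 == null) {
            return 0;
        }
        int i1 = 0;    // char index into s1
        int i2 = 0;    // char index into s2
        int count = 0; // matching code points so far
        while (i1 < s1.length() && i2 < s2.length()) {
            int cp1 = Character.codePointAt(s1, i1);
            int cp2 = Character.codePointAt(s2, i2);
            if (cp1 != cp2) {
                // Slow path: the existing API only accepts Strings, so allocate.
                String a = new String(Character.toChars(cp1));
                String b = new String(Character.toChars(cp2));
                if (collator.compare(a, b) != 0) {
                    return count;
                }
            }
            i1 += Character.charCount(cp1);
            i2 += Character.charCount(cp2);
            count++;
        }
        return count;
    }
}
```

With a Collator set to PRIMARY strength, case differences still take the slow path, so inputs that differ only in case allocate for every code point. A Collator.compare(int, int) overload would remove that remaining cost.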

relates to

JDK-8035473 [javadoc] Revamp the existing Doclet APIs (Closed)

JDK-8137326 Methods for comparing CharSequence, StringBuilder, and StringBuffer (Resolved)

Assignee: Naoto Sato
Reporter: Nelson Dcosta (Inactive)
Votes: 2
Watchers: 2
Created: 2008-02-18 23:32
Updated: 2019-04-11 10:55
Imported: 16/Sep/12 8:38 AM
Indexed: 18/Jul/12 4:06 AM
