-
Enhancement
-
Resolution: Not an Issue
-
P4
-
None
-
5.0
-
x86
-
linux
Name: js151677 Date: 06/15/2004
A DESCRIPTION OF THE REQUEST :
With Unicode 4 and supplementary characters, it has become painful to process Strings as char sequences. It would be easier to handle them as int sequences. The low-hanging fruit would be to supply a method
int[] toCodePointArray()
in the String class (and, if you feel generous, a constructor String(int[]) in addition to the existing String(int[], int, int)).
JUSTIFICATION :
Consider a typical string processing task: removing characters that match a particular criterion. It is painful to handle the surrogate pairs correctly.
In addition, it would be delightful to be able to iterate over the code points with the generalized for loop:
for (int cp : str.toCodePointArray()) { ... }
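To make the removal task above concrete, here is a minimal sketch of a code-point-aware filter. The class name `CodePointFilter` and the method `removeCodePoints` are hypothetical and not part of the JDK, and the use of `java.util.function.IntPredicate` assumes Java 8 or later, which is newer than the release this report targets. It steps through the string with `codePointAt` and `Character.charCount` so that a surrogate pair is never split.

```java
import java.util.function.IntPredicate;

class CodePointFilter {
    // Hypothetical helper (not a JDK method): returns a copy of str
    // without the code points for which 'unwanted' returns true.
    static String removeCodePoints(String str, IntPredicate unwanted) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); ) {
            int cp = str.codePointAt(i);
            if (!unwanted.test(cp)) {
                sb.appendCodePoint(cp);
            }
            // Advance by 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 'a', U+1D50A (encoded as a surrogate pair), 'b', '1'
        String s = "a\uD835\uDD0Ab1";
        System.out.println(removeCodePoints(s, Character::isSupplementaryCodePoint));
    }
}
```

With a `toCodePointArray()` method as requested, the index bookkeeping in the loop would reduce to a plain enhanced for loop over the array.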
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Right now, I use a helper method such as this one:
private static int[] getCodePointArray(String str)
{
    int[] codePoints = new int[str.codePointCount(0, str.length())];
    for (int i = 0, j = 0; i < str.length(); i++, j++)
    {
        int cp = str.codePointAt(i);
        if (Character.isSupplementaryCodePoint(cp)) i++;
        codePoints[j] = cp;
    }
    return codePoints;
}
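As a rough illustration of how the helper pairs with the existing String(int[], int, int) constructor, the sketch below converts a string to code points and back. The class name `CodePointRoundTrip` and both method names are hypothetical; the conversion follows the same approach as the helper above, but advances the index with `Character.charCount`.

```java
class CodePointRoundTrip {
    // Hypothetical stand-in for the requested String.toCodePointArray().
    static int[] toCodePointArray(String str) {
        int[] codePoints = new int[str.codePointCount(0, str.length())];
        for (int i = 0, j = 0; i < str.length(); j++) {
            int cp = str.codePointAt(i);
            codePoints[j] = cp;
            // 1 char for BMP code points, 2 for supplementary ones.
            i += Character.charCount(cp);
        }
        return codePoints;
    }

    // Hypothetical stand-in for the requested String(int[]) constructor,
    // built on the existing String(int[], int, int).
    static String fromCodePointArray(int[] codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    public static void main(String[] args) {
        String s = "a\uD835\uDD0Ab"; // 'a', U+1D50A, 'b'
        int[] cps = toCodePointArray(s);
        System.out.println(cps.length);
        System.out.println(fromCodePointArray(cps).equals(s));
    }
}
```

On Java 8 and later, which postdate this report, `str.codePoints().toArray()` should produce the same array without a hand-written loop.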
ACTUAL -
Ugh, don't get me going
(Incident Review ID: 276989)
======================================================================
Relates to: JDK-5003547 (str) add support for iterating over the codepoints in a string (Closed)