- Bug
- Resolution: Duplicate
- P4
- None
- 1.2.2
- sparc
- solaris_2.5
Name: mgC56079 Date: 05/14/99
The Javadoc specification for URLEncoder says:
-------------------
All other characters are converted into the 3-character string "%xy", where xy is the two-digit
hexadecimal representation of the lower 8-bits of the character.
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-------------------
In the current implementation, however, each such character is converted to bytes using the platform's
default character encoding (via CharToByteConverter), not by taking the lower 8 bits of the character.
------------------- URLEncoder.java
public static String encode(String s) {
    int maxBytesPerChar = 10;
    StringBuffer out = new StringBuffer(s.length());
    ByteArrayOutputStream buf = new ByteArrayOutputStream(maxBytesPerChar);
    OutputStreamWriter writer = new OutputStreamWriter(buf);
    for (int i = 0; i < s.length(); i++) {
        int c = (int)s.charAt(i);
        if (dontNeedEncoding.get(c)) {
            if (c == ' ') {
                c = '+';
            }
            out.append((char)c);
        } else {
            // convert to external encoding before hex conversion
            try {
                writer.write(c);
                writer.flush();
            } catch(IOException e) {
                buf.reset();
                continue;
            }
            byte[] ba = buf.toByteArray();
            for (int j = 0; j < ba.length; j++) {
                out.append('%');
                char ch = Character.forDigit((ba[j] >> 4) & 0xF, 16);
                // converting to use uppercase letter as part of
                // the hex value if ch is a letter.
                if (Character.isLetter(ch)) {
                    ch -= caseDiff;
                }
                out.append(ch);
                ch = Character.forDigit(ba[j] & 0xF, 16);
                if (Character.isLetter(ch)) {
                    ch -= caseDiff;
                }
                out.append(ch);
            }
            buf.reset();
        }
    }
    return out.toString();
}
-------------------
The specification should be updated to reflect this behavior.
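To make the discrepancy concrete, the following is a minimal sketch that contrasts the two readings for a single character. The class and method names (EncodeComparison, lowEightBits, viaCharset) are hypothetical and not part of the JDK, and the charset is passed explicitly here rather than taken from the platform default, so that the result is reproducible.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not JDK code.
public class EncodeComparison {

    // Documented behavior: a single "%xy" built from the lower 8 bits of the character.
    static String lowEightBits(char c) {
        return String.format("%%%02X", c & 0xFF);
    }

    // Observed behavior: one "%xy" per byte produced by a character encoding.
    // The real encode() uses the platform default; here the charset is an explicit argument.
    static String viaCharset(char c, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : String.valueOf(c).getBytes(cs)) {
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        char c = '\u3042'; // HIRAGANA LETTER A
        System.out.println(lowEightBits(c));                            // %42
        System.out.println(viaCharset(c, StandardCharsets.UTF_8));      // %E3%81%82
        System.out.println(viaCharset(c, StandardCharsets.ISO_8859_1)); // %3F (unmappable, replaced by '?')
    }
}
```

Under the documented rule the character collapses to "%42", which is indistinguishable from an encoded "B", while the byte-based conversion yields a result that depends on the encoding in effect.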
======================================================================
- duplicates JDK-4257115 URLEncoder and URLDecoder should support target character sets (Resolved)
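For readers on later JDK releases (not the 1.2.2 release this report was filed against), a two-argument URLEncoder.encode(String, String) is available that takes the target encoding by name. The sketch below assumes such a release; the class name ExplicitCharsetEncode and the helper encodeWith are hypothetical. It shows that the same input produces different output depending on the encoding chosen, which is the dependency the one-argument method hides.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical illustration class; assumes a JDK that provides
// URLEncoder.encode(String, String).
public class ExplicitCharsetEncode {

    // Encode with a named charset, turning the checked exception into an unchecked one.
    static String encodeWith(String s, String charsetName) {
        try {
            return URLEncoder.encode(s, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported encoding: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String s = "caf\u00E9"; // "cafe" with LATIN SMALL LETTER E WITH ACUTE
        System.out.println(encodeWith(s, "UTF-8"));      // caf%C3%A9
        System.out.println(encodeWith(s, "ISO-8859-1")); // caf%E9
    }
}
```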