JDK-4238263

URLEncoder specification incorrect about encoding algorithm

    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 1.2.2
    • docs
    • sparc
    • solaris_2.5



      Name: mgC56079 Date: 05/14/99



      The Javadoc specification for URLEncoder says:
      -------------------
      All other characters are converted into the 3-character string "%xy", where xy is the two-digit
            hexadecimal representation of the lower 8-bits of the character.
                                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      -------------------
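
      As an illustration only, the quoted sentence read literally amounts to the sketch below.
      The class and method names (Low8BitsSketch, encodeLow8) are hypothetical and are not part
      of URLEncoder; the point is that any character above U+00FF would lose its upper byte.

```java
// Hypothetical sketch of the algorithm as the quoted Javadoc text describes it.
// This is NOT the URLEncoder implementation; the names are made up for illustration.
class Low8BitsSketch {

    // Escapes one character as "%xy", where xy is the hex value of the
    // lower 8 bits of the char. The upper 8 bits are simply discarded.
    static String encodeLow8(char c) {
        int low = c & 0xFF;
        char hi = Character.toUpperCase(Character.forDigit(low >> 4, 16));
        char lo = Character.toUpperCase(Character.forDigit(low & 0xF, 16));
        return "%" + hi + lo;
    }

    public static void main(String[] args) {
        // U+00E9 fits in 8 bits, so nothing is lost.
        System.out.println(encodeLow8('\u00E9')); // prints %E9
        // U+3042 does not fit: the upper byte 0x30 is dropped, leaving 0x42.
        System.out.println(encodeLow8('\u3042')); // prints %42
    }
}
```

      Under this reading, U+3042 would become "%42", which is indistinguishable from an
      escaped 'B'.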

      In the current implementation, however, each character is converted to bytes using the platform
      default character encoding (through CharToByteConverter), not by taking the lower 8 bits of the
      character.
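
      A minimal sketch of why this matters: the same character can produce different escapes
      depending on the encoding used for the conversion. The helper below is hypothetical
      (CharsetEscapeSketch and escape are not JDK names), and it uses String.getBytes with an
      explicit charset in place of the platform default so that the result is reproducible.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: escape a character by first converting it to bytes in a
// given charset, then writing each byte as %XY. The real method uses the
// platform default encoding; an explicit charset is an assumption made here
// so the output does not depend on the machine running it.
class CharsetEscapeSketch {

    static String escape(char c, Charset cs) {
        byte[] bytes = String.valueOf(c).getBytes(cs);
        StringBuilder out = new StringBuilder();
        for (byte b : bytes) {
            out.append('%');
            // Mask after shifting so negative byte values yield a single hex digit.
            out.append(Character.toUpperCase(Character.forDigit((b >> 4) & 0xF, 16)));
            out.append(Character.toUpperCase(Character.forDigit(b & 0xF, 16)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // One byte in ISO-8859-1, two bytes in UTF-8 for the same character.
        System.out.println(escape('\u00E9', StandardCharsets.ISO_8859_1)); // prints %E9
        System.out.println(escape('\u00E9', StandardCharsets.UTF_8));      // prints %C3%A9
        // Three bytes in UTF-8; none of them is the lower 8 bits (0x42) of U+3042.
        System.out.println(escape('\u3042', StandardCharsets.UTF_8));      // prints %E3%81%82
    }
}
```

      Only the ISO-8859-1 result for U+00E9 happens to match the "lower 8 bits" wording; the
      other results have more than one escape per character.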

      ------------------- URLEncoder.java
          public static String encode(String s) {
              int maxBytesPerChar = 10;
              StringBuffer out = new StringBuffer(s.length());
              ByteArrayOutputStream buf = new ByteArrayOutputStream(maxBytesPerChar);
              OutputStreamWriter writer = new OutputStreamWriter(buf);

              for (int i = 0; i < s.length(); i++) {
                  int c = (int)s.charAt(i);
                  if (dontNeedEncoding.get(c)) {
                      if (c == ' ') {
                          c = '+';
                      }
                      out.append((char)c);
                  } else {
                      // convert to external encoding before hex conversion
                      try {
                          writer.write(c);
                          writer.flush();
                      } catch(IOException e) {
                          buf.reset();
                          continue;
                      }
                      byte[] ba = buf.toByteArray();
                      for (int j = 0; j < ba.length; j++) {
                          out.append('%');
                          char ch = Character.forDigit((ba[j] >> 4) & 0xF, 16);
                          // converting to use uppercase letter as part of
                          // the hex value if ch is a letter.
                          if (Character.isLetter(ch)) {
                              ch -= caseDiff;
                          }
                          out.append(ch);
                          ch = Character.forDigit(ba[j] & 0xF, 16);
                          if (Character.isLetter(ch)) {
                              ch -= caseDiff;
                          }
                          out.append(ch);
                      }
                      buf.reset();
                  }
              }

              return out.toString();
          }
      -------------------

      The specification should be updated to reflect this behavior.


      ======================================================================

            shommel Scott Hommel (Inactive)
            gorsunw Gor Gor (Inactive)