Uploaded image for project: 'JDK'
  1. JDK
  2. JDK-8318130

SocksSocketImpl needlessly encodes hostname for IPv6 addresses

XMLWordPrintable

    • b21
    • generic
    • generic
    • Verified

      ADDITIONAL SYSTEM INFORMATION :
      Is present in all the versions from at least late Java 9

      A DESCRIPTION OF THE PROBLEM :
      In https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/java/net/SocksSocketImpl.java#L306 , we can see that new URI is constructed by

          uri = new URI("socket://" + ParseUtil.encodePath(host) + ":"+ epoint.getPort());

      , where `host` is the IPv6 address with the surrounding braces added (e.g. `[fd00:baba:babe:0:0:0:0:862]` )

      This, however, is not what the URI c-tor expects, as to properly parse the URI it needs the literal braces in the string, see e.g.

      void test(String s) throws URISyntaxException {
              URI uri = new URI(s);
              System.out.println("===");
              System.out.println("Scheme: " + uri.getScheme());
              System.out.println("Host: " + uri.getHost());
              System.out.println("Port: " + uri.getPort());
              System.out.println("Authority: " + uri.getAuthority());
          }

          test("socket://[fd00:baba:babe:0:0:0:0:862]:55507");
          test("socket://%5fd00:baba:babe:0:0:0:0:862%5d:55507");
          test("socket://fd00:baba:babe:0:0:0:0:862:55507");

      As a result, the proper way to handle that would be just

          uri = new URI("socket://" + host + ":"+ epoint.getPort());

      Current code, unsurprisingly, causes an internal URISyntaxException during URI string parse, which then is internally caught and backtracks and puts the entire string into Authority part (with null host and port == -1). This, by itself, should theoretically crash the following

          iProxy = sel.select(uri).iterator();

      call due to incomplete uri, however usually it just works due to an unrelated workaround provided in the most common `DefaultProxySelector#select(URI uri)` implementation (see https://github.com/openjdk/jdk/blob/jdk-10%2B24/src/java.base/share/classes/sun/net/spi/DefaultProxySelector.java#L168 ), which has this:

              if (host == null) {
                  // This is a hack to ensure backward compatibility in two
                  // cases: 1. hostnames contain non-ascii characters,
                  // internationalized domain names. in which case, URI will
                  // return null, see BugID 4957669; 2. Some hostnames can
                  // contain '_' chars even though it's not supposed to be
                  // legal, in which case URI will return null for getHost,
                  // but not for getAuthority() See BugID 4913253
      // followed by code that manually parses the authority via indexOf(':') and substring()

      (which obviously, as per description, is not even a workaround for this case!) and handles the invalid port in URI because it gets port number via other means (not from uri var).

      This is both unexpected, is a result of a very likely coding oversight, has performance downsides (internal needless exceptions created with stacktrace populated, needless workarounds executed), and thankfully can be trivially fixed.

      FWIW, it seems this was lurking in this place from the very inception of this file in the source repo, i.e. JDK 8, maybe all the way back to JDK 1.4.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      * trigger the code in question with a `Socket#connect()` call to a SOCKS socket and with attached debugger with "stop on exceptions"
      * observe the code flow

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      no internal exceptions will be triggered on valid input data and no needless workarounds will be executed
      ACTUAL -
      URISyntaxException is triggered, workaround for unhandled URIs is executed

      ---------- BEGIN SOURCE ----------
          import java.net.*;

          new Socket("[fd00:baba:babe:0:0:0:0:862]", 55507).connect(new InetSocketAddress(12345));

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      none needed; the current library code flow already contains a workaround

      FREQUENCY : always


            djelinski Daniel Jelinski
            webbuggrp Webbug Group
            Votes:
            0 Vote for this issue
            Watchers:
            9 Start watching this issue

              Created:
              Updated:
              Resolved: