Enhancement
Resolution: Unresolved
P4
None
8, 9
A DESCRIPTION OF THE REQUEST :
The java.net.URI.getHost() method returns null if a given URI is of the form "http://aaa.daaa/", where "a" denotes an alpha character and "d" a digit character.
This behaviour matches the Javadoc for getHost, which describes a domain-name host as follows:
<li><p> A domain name consisting of one or more <i>labels</i>
separated by period characters ({@code '.'}), optionally followed by
a period character. Each label consists of <i>alphanum</i> characters
as well as hyphen characters ({@code '-'}), though hyphens never
occur as the first or last characters in a label. The rightmost
label of a domain name consisting of two or more labels, begins
with an <i>alpha</i> character. </li>
It looks like Java follows RFC-952 here. However, RFC 1123 states:
The syntax of a legal Internet host name was specified in RFC-952
[DNS:4]. One aspect of host name syntax is hereby changed: the
restriction on the first character is relaxed to allow either a
letter or a digit. Host software MUST support this more liberal
syntax.
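The difference between the two rules quoted above can be sketched with a pair of regular expressions. This is only an illustration of the quoted text, not the parser used inside java.net.URI: the class name HostnameSyntax and both patterns are hypothetical, and length limits on labels and names are deliberately left out.

```java
import java.util.regex.Pattern;

// Hypothetical illustration; this is NOT how java.net.URI parses hosts.
public class HostnameSyntax {
    // One label: alphanumerics and hyphens, no hyphen first or last.
    private static final String LABEL =
            "[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?";

    // Same label shape, but the first character must be a letter.
    private static final String ALPHA_LABEL =
            "[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?";

    // Relaxed rule from the RFC 1123 quote: any label may start with a digit.
    static final Pattern RFC1123 =
            Pattern.compile("(?:" + LABEL + "\\.)*" + LABEL + "\\.?");

    // Rule from the quoted Javadoc: with two or more labels,
    // the rightmost label must begin with a letter.
    static final Pattern JAVADOC =
            Pattern.compile(LABEL + "\\.?"
                    + "|(?:" + LABEL + "\\.)+" + ALPHA_LABEL + "\\.?");

    public static void main(String[] args) {
        for (String host : new String[] {"foo.bar", "foo.1bar", "foo.10int000"}) {
            System.out.println(host
                    + " rfc1123=" + RFC1123.matcher(host).matches()
                    + " javadoc=" + JAVADOC.matcher(host).matches());
        }
    }
}
```

Under these assumptions, "foo.1bar" and "foo.10int000" satisfy the relaxed pattern but not the Javadoc one, which is the gap this request is about.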
We expect the latter behaviour because our infrastructure has systems with DNS names like "foo.10int000".
JUSTIFICATION :
Since RFC 1123 is the current standard, infrastructures may use hostnames that follow its relaxed rules. JDK applications cannot correctly parse URIs whose hostname is valid according to that RFC but begins its rightmost label with a digit.
---------- BEGIN SOURCE ----------
import java.net.URI;

public class UriTest {
    public static void main(String[] args) {
        try {
            System.out.println(new URI("http://foo.bar").getHost()); // prints "foo.bar"
            System.out.println(new URI("http://foo.1bar").getHost()); // prints "null" instead of "foo.1bar"
        } catch (Exception e) {
            System.err.println(e.getMessage());
        }
    }
}
---------- END SOURCE ----------
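Until the parser is relaxed, one possible workaround is to fall back to the raw authority component when getHost() returns null. The sketch below is only an assumption about how an application might cope. The class HostFallback and the method hostOf are hypothetical names, not part of the JDK, and the fallback simply strips any user-info and a trailing numeric port without validating the remaining text.

```java
import java.net.URI;

// Hypothetical helper; not part of java.net.URI.
public class HostFallback {
    // Returns getHost() when available, otherwise derives a host
    // from the raw authority (assumed form: [userinfo@]host[:port]).
    static String hostOf(URI uri) {
        String host = uri.getHost();
        if (host != null) {
            return host;
        }
        String authority = uri.getRawAuthority();
        if (authority == null) {
            return null;
        }
        // Drop any user-info prefix.
        int at = authority.lastIndexOf('@');
        String hostPort = at >= 0 ? authority.substring(at + 1) : authority;
        // Drop a trailing ":port" when the part after the colon is all digits.
        int colon = hostPort.lastIndexOf(':');
        if (colon >= 0
                && hostPort.substring(colon + 1).chars().allMatch(Character::isDigit)) {
            hostPort = hostPort.substring(0, colon);
        }
        return hostPort.isEmpty() ? null : hostPort;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(hostOf(new URI("http://foo.bar")));
        System.out.println(hostOf(new URI("http://foo.1bar:8080/path")));
    }
}
```

The helper returns the same value whether or not a given JDK already accepts the hostname, so it stays harmless if the behaviour changes in a later release.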