JDK-4406592

HttpURLConnection fails on some valid URLs with FileNotFoundException


    • Type: Bug
    • Resolution: Duplicate
    • Priority: P4
    • None
    • 1.3.0
    • core-libs
    • sparc
    • solaris_2.6



      Name: boT120536 Date: 01/21/2001


      java version "1.3.0"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0)
      Java HotSpot(TM) Client VM (build 1.3.0, mixed mode)

      The sun.net.www.protocol.http.HttpURLConnection class fails to access some
      valid URLs. The failure occurs under Solaris 2.6, in both JDK 1.3.0 and
      JDK 1.2.2. It also occurs under MacOS X beta (which I realize is not
      supported by Sun). It does NOT fail under Linux.

      The URLs in question are not at our site, but I include a test program
      that should illustrate the failure. The test code below successfully
      accesses the first three URLs:
         http://www.isi.edu
         http://citeseer.nj.nec.com/
         http://citeseer.nj.nec.com/correct/

      and then fails on the fourth and fifth:
         http://citeseer.nj.nec.com/correct/163004
         http://citeseer.nj.nec.com/correct/163004/

      All five URLs can be read successfully with the Netscape browser.
      All five URLs also return response code 200 when a telnet connection is
      made to port 80 on the host and the request is sent manually in either
      HTTP/1.0 or HTTP/1.1 format.
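
The manual telnet check described above could be scripted roughly as follows. This is a minimal sketch, not the reporter's procedure: the exact request text typed into telnet is not given in the report, so the request lines, the `Host` and `Connection: close` headers, and the names `RawHttpProbe`, `buildRequest`, and `fetchStatusLine` are all assumptions made for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class RawHttpProbe {

    // Builds the request text one might type into a telnet session.
    // Assumption: HTTP/1.1 requests add Host and Connection: close;
    // HTTP/1.0 requests send the request line only.
    static String buildRequest(String host, String path, boolean http11) {
        StringBuilder sb = new StringBuilder();
        sb.append("GET ").append(path)
          .append(http11 ? " HTTP/1.1" : " HTTP/1.0").append("\r\n");
        if (http11) {
            sb.append("Host: ").append(host).append("\r\n");
            sb.append("Connection: close\r\n");
        }
        sb.append("\r\n"); // blank line ends the request headers
        return sb.toString();
    }

    // Sends the request over a plain socket and returns the first response
    // line (the status line), or null if the server closed the connection.
    static String fetchStatusLine(String host, int port, String path,
                                  boolean http11) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path, http11)
                    .getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(new InputStreamReader(
                    socket.getInputStream(), StandardCharsets.ISO_8859_1));
            return in.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            // No network access without explicit arguments.
            System.out.println("usage: java RawHttpProbe <host> <path>");
            return;
        }
        System.out.println(fetchStatusLine(args[0], 80, args[1], false));
        System.out.println(fetchStatusLine(args[0], 80, args[1], true));
    }
}
```

Comparing the status lines printed here with the response code reported by HttpURLConnection for the same URL may help show whether the server answers the two kinds of request differently.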

      Source code to demonstrate the problem:

      import java.net.*;
      import java.util.*;
      import java.io.*;

      public class HttpTest {

          public static void getPage(String urlString) throws IOException {
              // Retrieve the page at `urlString' and print the first 500 bytes.

              URL url = new URL(urlString);
              InputStream pageStream;
              int ch;
              int count = 0;
              HttpURLConnection connection = null;

              if (url.getProtocol().equalsIgnoreCase("http")) {
                  try {
                      System.out.println("==== RETRIEVING " + url + " ====");
                      System.out.println();
                      connection = (HttpURLConnection) url.openConnection();
                      System.err.println("HttpURLConnection opened: " + connection);
                      pageStream = connection.getInputStream();
                      System.err.println("HttpURLConnection input Stream: " + pageStream);
                      System.err.print("Response: ");
                      System.err.print(connection.getResponseCode());
                      System.err.println(" " + connection.getResponseMessage());
                      System.out.println();
                      System.out.println("==== CONTENT ====");
                      System.out.println();
                      for (ch = pageStream.read(); ch != -1; ch = pageStream.read()) {
                          if (++count < 500) {
                              System.out.write(ch);
                          } else if (count == 500) {
                              System.out.println();
                              System.out.println("<More...>");
                          }
                      }
                      System.out.println();
                      System.out.println("==== DONE " + count + " bytes ====");
                      System.out.println();
                  } catch (Exception e) {
                      System.err.println();
                      System.err.println("*** ERROR: " + e);
                      e.printStackTrace();
                      System.err.println();
                  }
              }
          }

          public static void main(String args[]) {
              // Run a loop with test URLs via the Java http URL support.

              // All of these urls work from a browser.
              // All of them work from Linux
              // All of them work manually telnetting to port 80
              // and issuing a "GET <url> HTTP/1.0" command.
              //
              // Two different "failures" occur in other systems:
              // The latter two fail on Solaris, jdk 1.3.0 and jdk 1.2.2
              // The latter two fail on MacOS X jdk 1.2.2
              String[] urls = new String[] {
                  "http://www.isi.edu",
                  "http://citeseer.nj.nec.com/",
                  "http://citeseer.nj.nec.com/correct/",
                  "http://citeseer.nj.nec.com/correct/163004", // Fails with error
                  "http://citeseer.nj.nec.com/correct/163004/" // Fails with busy
              };

              for (int i = 0; i < urls.length; i++) {
                  try {
                      getPage(urls[i]);
                  } catch (Exception e) {
                      System.err.println();
                      System.err.println("**** Error: " + e);
                      e.printStackTrace();
                      System.err.println();
                  }
              }
          }
      }
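
In the listing above, getInputStream() is called before getResponseCode(), so when getInputStream() throws, the status code the server actually sent is never printed. The variant below is a hedged sketch of a diagnostic that reads the status first and then picks the stream to read. It assumes the behavior of current JDKs, where getResponseCode() returns the code for error responses and getErrorStream() exposes the error body; it has not been checked against JDK 1.2.2 or 1.3.0. The names `StatusFirstFetch`, `isErrorStatus`, and `openBody` are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class StatusFirstFetch {

    // Assumption: codes of 400 and above are treated as error responses
    // whose body, if any, is read from getErrorStream().
    static boolean isErrorStatus(int status) {
        return status >= 400;
    }

    // Returns the response body stream, choosing the error stream for
    // error responses. Falls back to an empty stream when the server
    // sent no error body.
    static InputStream openBody(HttpURLConnection connection) throws IOException {
        int status = connection.getResponseCode();
        if (isErrorStatus(status)) {
            InputStream error = connection.getErrorStream();
            return error != null ? error : new ByteArrayInputStream(new byte[0]);
        }
        return connection.getInputStream();
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            // No network access without an explicit URL.
            System.out.println("usage: java StatusFirstFetch <http-url>");
            return;
        }
        HttpURLConnection connection =
                (HttpURLConnection) new URL(args[0]).openConnection();
        try {
            // Print the status before touching the body stream.
            System.err.println("Response: " + connection.getResponseCode()
                    + " " + connection.getResponseMessage());
            try (InputStream body = openBody(connection)) {
                int count = 0;
                // Print at most the first 500 bytes of the body.
                for (int ch = body.read(); ch != -1 && count < 500; ch = body.read()) {
                    System.out.write(ch);
                    count++;
                }
                System.out.flush();
            }
        } finally {
            connection.disconnect();
        }
    }
}
```

Run against the fourth URL, a diagnostic of this kind would show which status code accompanies the FileNotFoundException, which the original trace does not reveal.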



      Sample Trace of the program running on our system:

      ==== RETRIEVING http://www.isi.edu ====

      HttpURLConnection opened: sun.net.www.protocol.http.HttpURLConnection:http://www.isi.edu
      HttpURLConnection input Stream: sun.net.www.http.KeepAliveStream@4b222f
      Response: 200 OK

      ==== CONTENT ====

      <HTML>

      <HEAD><TITLE>USC Information Sciences
      Institute</TITLE></HEAD>

      <BODY BACKGROUND="images/bg-nologo.jpg"
      TEXT="#000000" LINK="#AA0000" VLINK="#111111">


      <MAP NAME="ISI">

      <AREA SHAPE=rect HREF="http://www.isi.edu/about.html"
      COORDS="18,112,122,151">
      <AREA SHAPE=rect
      HREF="http://www.isi.edu/publications.html" COORDS="16,154,121,192">

      <AREA SHAPE=rect HREF="http://www.isi.edu/servicelist.html"
      COORDS="17,195,120,233">
      <AREA SHAPE=rect
      HREF="http://www.isi.edu/divisions/main/index.
      <More...>

      ==== DONE 3120 bytes ====

      ==== RETRIEVING http://citeseer.nj.nec.com/ ====

      HttpURLConnection opened: sun.net.www.protocol.http.HttpURLConnection:http://citeseer.nj.nec.com/
      HttpURLConnection input Stream: sun.net.www.http.KeepAliveStream@2125f0
      Response: 200 OK

      ==== CONTENT ====

      <html><head><TITLE>ResearchIndex: The NECI Scientific Literature Digital Library
      [Steve Lawrence, Kurt Bollacker, Lee Giles, NEC Research Institute]</TITLE>
       <!70>
      <META name="description" content="ResearchIndex (formerly CiteSeer): The NECI
      Scientific Literature Digital Library. Autonomously creates citation indexes of
      scientific literature. Advantages in terms of availability, coverage,
      timeliness, and efficiency. Generates citation statistics and allows easy
      browsing of the context of citati
      <More...>

      ==== DONE 11077 bytes ====

      ==== RETRIEVING http://citeseer.nj.nec.com/correct/ ====

      HttpURLConnection opened: sun.net.www.protocol.http.HttpURLConnection:http://citeseer.nj.nec.com/correct/
      HttpURLConnection input Stream: sun.net.www.http.KeepAliveStream@41cd1f
      Response: 200 OK

      ==== CONTENT ====

      <html><head><TITLE>ResearchIndex: The NECI Scientific Literature Digital Library
      [Steve Lawrence, Kurt Bollacker, Lee Giles, NEC Research Institute]</TITLE>
      <!9>
      <META name="description" content="ResearchIndex (formerly CiteSeer): The NECI
      Scientific Literature Digital Library. Autonomously creates citation indexes of
      scientific literature. Advantages in terms of availability, coverage,
      timeliness, and efficiency. Generates citation statistics and allows easy
      browsing of the context of citation
      <More...>

      ==== DONE 11097 bytes ====

      ==== RETRIEVING http://citeseer.nj.nec.com/correct/163004 ====

      HttpURLConnection opened: sun.net.www.protocol.http.HttpURLConnection:http://citeseer.nj.nec.com/correct/163004

      *** ERROR: java.io.FileNotFoundException: http://citeseer.nj.nec.com/correct/163004
      java.io.FileNotFoundException: http://citeseer.nj.nec.com/correct/163004
              at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:545)
              at ir_tools.HttpTest2.getPage(HttpTest2.java:24)
              at ir_tools.HttpTest2.main(HttpTest2.java:62)

      ==== RETRIEVING http://citeseer.nj.nec.com/correct/163004/ ====

      HttpURLConnection opened: sun.net.www.protocol.http.HttpURLConnection:http://citeseer.nj.nec.com/correct/163004/
      HttpURLConnection input Stream: sun.net.www.MeteredStream@31f71a
      Response: 503 System busy

      ==== CONTENT ====

      <!DOCTYPE HTML
      PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
      "http://www.w3.org/TR/html4/loose.dtd">
      <HTML LANG="en-US"><HEAD><TITLE>ResearchIndex [NEC Research Institute; Steve
      Lawrence, Kurt Bollacker, Lee Giles; Computer Science]</TITLE>
      <LINK REV=MADE HREF="mailto:lawrence%40research.nj.nec.com">
      <BASE HREF="http://citeseer.nj.nec.com/correct/163004/">
      <META NAME="description" CONTENT="ResearchIndex (CiteSeer): Scientific
      Literature Digital Library incorporating autonomous citation inde
      <More...>

      ==== DONE 1530 bytes ====
      (Review ID: 115411)
      ======================================================================

            alanb Alan Bateman
            bonealsunw Bret O'neal (Inactive)