JDK-7069177

guessContentTypeFromStream incorrectly assesses file type

    • Type: Bug
    • Resolution: Unresolved
    • Priority: P4
    • None
    • 7
    • core-libs
    • x86
    • windows_xp

      FULL PRODUCT VERSION :
      found in Java 7 sources, present in Java 6/7

      ADDITIONAL OS VERSION INFORMATION :
      probably any (source bug)

      A DESCRIPTION OF THE PROBLEM :
      Calling guessContentTypeFromStream on any stream that contains a Microsoft RIFF container other than an actual WAV file results in the file being identified as WAV. Per the source:
      if (c1 == 'R' && c2 == 'I' && c3 == 'F' && c4 == 'F') {
      /* I don't know if this is official but evidence
      * suggests that .wav files start with "RIFF" - brown
      */
      return "audio/x-wav";
      }

      The assumption in that comment is wrong (see the links below for a detailed rationale). WAV files do start with the RIFF signature, but the 'RIFF' FourCC only signifies that a RIFF container has been found. Every RIFF file (WAV, AVI, RMI, CDR, etc.) has the 'RIFF' FourCC at the start of its header; the bytes that follow must be inspected to identify the file type accurately.

      http://en.wikipedia.org/wiki/Resource_Interchange_File_Format
      http://oreilly.com/www/centers/gff/formats/micriff/
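
The header layout described above can be illustrated with a minimal sketch. It assumes the usual RIFF layout: the 'RIFF' signature in bytes 0-3, a little-endian chunk size in bytes 4-7, and the form type FourCC in bytes 8-11. The class and method names (RiffHeader, chunkSize, formType) are hypothetical and are not part of the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not JDK code: reads the fixed 12-byte RIFF header.
final class RiffHeader {
    /** True if the buffer holds at least 12 bytes and starts with "RIFF". */
    static boolean isRiff(byte[] header) {
        return header != null && header.length >= 12
                && header[0] == 'R' && header[1] == 'I'
                && header[2] == 'F' && header[3] == 'F';
    }

    /** Chunk size from bytes 4-7 (little-endian, unsigned), or -1 if not RIFF. */
    static long chunkSize(byte[] header) {
        if (!isRiff(header)) return -1;
        int raw = ByteBuffer.wrap(header, 4, 4).order(ByteOrder.LITTLE_ENDIAN).getInt();
        return Integer.toUnsignedLong(raw);
    }

    /** Form type from bytes 8-11 (e.g. "WAVE", "AVI ", "RMID"), or null if not RIFF. */
    static String formType(byte[] header) {
        if (!isRiff(header)) return null;
        return new String(header, 8, 4, StandardCharsets.US_ASCII);
    }
}
```

Only the form type distinguishes a WAV file from an AVI or RMI file; the signature alone is identical for all of them.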

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Call guessContentTypeFromStream on any stream that contains a Microsoft RIFF container other than a WAV file.
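
A possible reproducer, as a sketch only: it builds synthetic 16-byte RIFF headers in memory (no real media files) and passes them to URLConnection.guessContentTypeFromStream. The class name RiffGuessRepro and its helper methods are hypothetical. The printed result depends on the JDK in use; on an affected version the report implies that every line would show audio/x-wav.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical reproducer; only URLConnection.guessContentTypeFromStream is JDK API.
final class RiffGuessRepro {
    /** Builds a 16-byte buffer: "RIFF", a zeroed size field, the form type, zero padding. */
    static byte[] riffHeader(String formType) {
        byte[] header = new byte[16];
        System.arraycopy("RIFF".getBytes(StandardCharsets.US_ASCII), 0, header, 0, 4);
        System.arraycopy(formType.getBytes(StandardCharsets.US_ASCII), 0, header, 8, 4);
        return header;
    }

    /** Asks the JDK to guess the content type of a synthetic RIFF header. */
    static String guess(String formType) {
        // ByteArrayInputStream supports mark/reset, which the JDK method needs.
        InputStream in = new ByteArrayInputStream(riffHeader(formType));
        try {
            return URLConnection.guessContentTypeFromStream(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (String type : new String[] {"WAVE", "AVI ", "RMID"}) {
            System.out.println("'" + type + "' -> " + guess(type));
        }
    }
}
```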

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      The content type is identified from the RIFF form type, i.e. the FourCC that follows the 'RIFF' signature and the size field.
      ACTUAL -
      Every RIFF file is erroneously identified as WAV.

      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      no error reported by VM

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      replacing
      if (c1 == 'R' && c2 == 'I' && c3 == 'F' && c4 == 'F') {
          /* I don't know if this is official but evidence
           * suggests that .wav files start with "RIFF" - brown
           */
          return "audio/x-wav";
      }
      with
      if (c1 == 'R' && c2 == 'I' && c3 == 'F' && c4 == 'F') { // Microsoft RIFF container
          // c9..c12 are the ninth to twelfth bytes of the stream (the RIFF form type)
          if (c9 == 'W' && c10 == 'A' && c11 == 'V' && c12 == 'E')
              return "audio/x-wav";
          if (c9 == 'R' && c10 == 'M' && c11 == 'I' && c12 == 'D')
              return "audio/mid";
          if (c9 == 'A' && c10 == 'V' && c11 == 'I' && c12 == ' ')
              return "video/avi";
          // and so on, for other RIFF formats such as CDR, ANI, DLS, WebP etc.
      }
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Don't use guessContentTypeFromStream and use own method designed for that case.
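
One way such a workaround method might look, as a sketch under stated assumptions: the class RiffAwareGuesser and its method are hypothetical, the MIME strings are the ones proposed in this report, and unknown RIFF form types yield null here instead of being reported as WAV. Non-RIFF input is passed on to the JDK method unchanged.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

// Hypothetical workaround, not JDK code.
final class RiffAwareGuesser {
    /** Guesses a content type, checking the RIFF form type before delegating to the JDK. */
    static String guessContentType(InputStream in) throws IOException {
        if (!in.markSupported()) return null; // same precondition as the JDK method

        // Peek at the first 12 bytes, then rewind so the caller's stream is untouched.
        byte[] h = new byte[12];
        in.mark(h.length);
        int n = 0;
        while (n < h.length) {
            int r = in.read(h, n, h.length - n);
            if (r < 0) break;
            n += r;
        }
        in.reset();

        boolean riff = n == h.length
                && h[0] == 'R' && h[1] == 'I' && h[2] == 'F' && h[3] == 'F';
        if (!riff) {
            return URLConnection.guessContentTypeFromStream(in);
        }
        switch (new String(h, 8, 4, StandardCharsets.US_ASCII)) {
            case "WAVE": return "audio/x-wav";
            case "RMID": return "audio/mid";
            case "AVI ": return "video/avi";
            default:     return null; // unknown RIFF form type: do not claim WAV
        }
    }
}
```

Returning null for unrecognized form types is a design choice of this sketch; a caller could equally fall back to a generic type.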

            Assignee: Unassigned
            Reporter: webbuggrp (Webbug Group)