  1. JDK
  2. JDK-8279338

In javax.sound.sampled, frames, samples, frame length etc. are not defined clearly enough


Details

    • Enhancement
    • Resolution: Unresolved
    • P4
    • tbd
    • 17
    • client-libs
    • None

    Description

      javax.sound.sampled describes audio using terms like samples, frames and their corresponding rates and lengths. They are relatively well defined in the tutorial at https://docs.oracle.com/javase/tutorial/sound/sampled-overview.html. It's all easy for uncompressed formats, but gets a little hairy for compressed formats. The tutorial says:

      "A frame contains the data for all channels at a particular time. For PCM-encoded data, the frame is simply the set of simultaneous samples in all channels, for a given instant in time, without any additional information. In this case, the frame rate is equal to the sample rate, and the frame size in bytes is the number of channels multiplied by the sample size in bits, divided by the number of bits in a byte.

      For other kinds of encodings, a frame might contain additional information besides the samples, and the frame rate might be completely different from the sample rate. [...] In MP3, each frame contains a bundle of compressed data for a series of samples, not just one sample per channel. Because each frame encapsulates a whole series of samples, the frame rate is slower than the sample rate. The frame also contains a header. Despite the header, the frame size in bytes is less than the size in bytes of the equivalent number of PCM frames. [...] For such an encoding, the sample rate and sample size refer to the PCM data that the encoded sound will eventually be converted into before being delivered to a digital-to-analog converter (DAC)."

      So the frame rate for a typical mp3 file (44.1 kHz sample rate, 1152 samples per frame) is 44100/1152 = 38.281250 frames/sec, and the frame length (that is, the number of frames in a file) should be duration*frameRate.
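
      The arithmetic behind these numbers can be sketched as follows. This is only an illustration: the class and method names are made up for this report, and the constant of 1152 samples per frame assumes MPEG-1 Layer III, which the tutorial does not state explicitly.

```java
final class Mp3FrameMath {
    // Assumption: MPEG-1 Layer III packs 1152 PCM samples (per channel) into one frame.
    static final int SAMPLES_PER_MP3_FRAME = 1152;

    /** Frames per second for an encoding that bundles several samples per frame. */
    public static double frameRate(double sampleRate, int samplesPerFrame) {
        return sampleRate / samplesPerFrame;
    }

    /** Frame length (number of frames) for a stream of the given duration. */
    public static double frameLength(double durationSeconds, double frameRate) {
        return durationSeconds * frameRate;
    }

    public static void main(String[] args) {
        double rate = frameRate(44100.0, SAMPLES_PER_MP3_FRAME);
        System.out.println(rate);                    // prints 38.28125
        System.out.println(frameLength(10.0, rate)); // prints 382.8125
    }
}
```

      For PCM the same helper degenerates to frameRate == sampleRate, because samplesPerFrame is 1.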


      AudioFormat
      -----------------

      AudioFormat offers only a tiny bit of documentation regarding compressed formats like mp3. It says: "However, with some other sorts of encodings a frame can contain a bundle of compressed data for a whole series of samples, as well as additional, non-sample data. For such encodings, the sample rate and sample size refer to the data after it is decoded into PCM, and so they are completely different from the frame rate and frame size."

      The javadocs for getFrameRate() and getFrameSize() do not mention compressed formats at all, nor do they give any example.

      This should be improved! We can make this easier for developers and SPI implementers. Why not steal some of the wording from the tutorial and incorporate examples for compressed audio as well?
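
      As a possible starting point for such an example, here is a minimal sketch that contrasts a PCM format with a compressed one. The encoding name "MPEG1L3" and the choice of AudioSystem.NOT_SPECIFIED for the sample size and frame size of the mp3 format are assumptions made for illustration; an actual SPI implementation may report different values.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;

final class AudioFormatFrames {
    /** CD-quality PCM: 44.1 kHz, 16 bit, stereo, signed, little-endian. */
    public static AudioFormat pcmCd() {
        return new AudioFormat(44100f, 16, 2, true, false);
    }

    /** Hypothetical description of a 44.1 kHz stereo mp3 stream. */
    public static AudioFormat mp3() {
        // Assumption: the encoding name is illustrative, not defined by the JDK.
        AudioFormat.Encoding encoding = new AudioFormat.Encoding("MPEG1L3");
        return new AudioFormat(
                encoding,
                44100f,                     // sample rate of the decoded PCM data
                AudioSystem.NOT_SPECIFIED,  // sample size in bits (assumed unknown)
                2,                          // channels
                AudioSystem.NOT_SPECIFIED,  // frame size in bytes (varies per frame)
                44100f / 1152,              // frame rate: 1152 samples per mp3 frame
                false);
    }

    public static void main(String[] args) {
        AudioFormat pcm = pcmCd();
        // PCM: frame rate equals sample rate, frame size = channels * bytes per sample.
        System.out.println(pcm.getFrameRate() + " frames/sec, " + pcm.getFrameSize() + " bytes/frame");

        AudioFormat mp3 = mp3();
        // Compressed: frame rate is far below the sample rate.
        System.out.println(mp3.getFrameRate() + " frames/sec at " + mp3.getSampleRate() + " samples/sec");
    }
}
```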


      AudioFileFormat
      ----------------------

      Unfortunately, AudioFileFormat uses the undefined term "sample frames".
      In https://docs.oracle.com/en/java/javase/17/docs/api/java.desktop/javax/sound/sampled/AudioFileFormat.html#getFrameLength() it says:

      "Obtains the length of the audio data contained in the file, expressed in sample frames."

      Now what are these mysterious "sample frames"?

      The tutorial says about AudioFileFormat that it contains "The length, in frames, of the audio data contained in the file". Following the tutorial, this has nothing to do with "samples", but only with "frames" in the AudioFormat sense. So a 10-second mp3 file should have a frameLength of duration*frameRate, i.e. 10 sec * 38.281250 frames/sec = 382.8125 frames.

      If that's the correct interpretation of the API, then the javadocs should be changed accordingly, perhaps explicitly mentioning compressed formats and giving a promise that frameLength / frameRate = duration is always true, if neither frameRate nor frameLength is AudioSystem.NOT_SPECIFIED.
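
      If the javadocs made that promise, client code could compute durations along the following lines. This is a sketch under the interpretation described above; the class and method names are hypothetical, and today's API gives no such guarantee for compressed formats.

```java
import java.util.OptionalDouble;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioSystem;

final class FrameDuration {
    /** Duration in seconds, or empty if frame length or frame rate is unknown. */
    public static OptionalDouble durationSeconds(long frameLength, float frameRate) {
        if (frameLength == AudioSystem.NOT_SPECIFIED
                || frameRate == AudioSystem.NOT_SPECIFIED
                || frameRate <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(frameLength / (double) frameRate);
    }

    /** Convenience overload that reads both values from an AudioFileFormat. */
    public static OptionalDouble durationSeconds(AudioFileFormat fileFormat) {
        return durationSeconds(fileFormat.getFrameLength(),
                fileFormat.getFormat().getFrameRate());
    }

    public static void main(String[] args) {
        // 441000 PCM frames at 44100 frames/sec.
        System.out.println(durationSeconds(441000L, 44100f).getAsDouble()); // prints 10.0
        // Unknown frame length yields no duration.
        System.out.println(durationSeconds(AudioSystem.NOT_SPECIFIED, 44100f).isPresent()); // prints false
    }
}
```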


      AudioInputStream
      ------------------------

      The AudioInputStream constructor takes a "length" argument, which apparently refers to a frameLength. The parameter length should be renamed to frameLength to make this clearer. Also, the javadocs again mention "sample frames", which are undefined. I assume that again the frame definition from the tutorial holds and these "sample frames" are actually "frames" and could be one sample for each channel or, in the case of mp3, a whole lot of samples contained in an mp3 frame.

      But there is a problem: AudioInputStream contains logic that prevents users from reading past the given frameLength (see https://github.com/openjdk/jdk/blob/0a65e8b282fd41e57108422fbd140527d9697efd/src/java.desktop/share/classes/javax/sound/sampled/AudioInputStream.java#L261). For this, the class uses the frameSize from the AudioFormat object. Unfortunately, this logic breaks as soon as the frameSize is unknown, because then it is assumed to be 1 (see https://github.com/openjdk/jdk/blob/0a65e8b282fd41e57108422fbd140527d9697efd/src/java.desktop/share/classes/javax/sound/sampled/AudioInputStream.java#L130). Therefore, we must only check whether we are reading beyond frameLength if *both* frameLength and frameSize are given. The code should be changed accordingly.
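
      The proposed check could look roughly like the sketch below. It is not the JDK code: the helper class, its name and its parameters are hypothetical and only illustrate the rule "clamp the read length only when both frameLength and frameSize are known".

```java
import javax.sound.sampled.AudioSystem;

final class ReadLengthGuard {
    /**
     * Returns how many bytes a read of {@code len} bytes may consume,
     * or -1 if the end of the stream (in frames) has been reached.
     * The length is clamped only when both frameLength and frameSize are known.
     */
    public static int clampReadLength(int len, long framePos, long frameLength, int frameSize) {
        boolean lengthKnown = frameLength != AudioSystem.NOT_SPECIFIED;
        boolean sizeKnown = frameSize != AudioSystem.NOT_SPECIFIED && frameSize > 0;
        if (!lengthKnown || !sizeKnown) {
            return len; // cannot translate frames to bytes, so do not restrict the read
        }
        if (framePos >= frameLength) {
            return -1; // all frames have been consumed
        }
        long framesLeft = frameLength - framePos;
        if (len / frameSize > framesLeft) {
            // Safe cast: framesLeft * frameSize is smaller than len here.
            return (int) (framesLeft * frameSize);
        }
        return len;
    }

    public static void main(String[] args) {
        System.out.println(clampReadLength(4096, 0, 100, 4));                         // prints 400
        System.out.println(clampReadLength(4096, 100, 100, 4));                       // prints -1
        System.out.println(clampReadLength(4096, 0, 100, AudioSystem.NOT_SPECIFIED)); // prints 4096
    }
}
```

      With an unknown frameSize the caller simply reads until the underlying stream signals end of data, instead of being cut off after frameLength bytes.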


      Other Audio APIs
      -----------------------

      It might be worthwhile to point out that other audio frameworks use the same terms with different meanings. For example, in Apple Core Audio (https://developer.apple.com/documentation/coreaudiotypes/audiostreambasicdescription) a javax.sound.sampled "frame" for a compressed encoding would be referred to as a "packet". And a "frame" is defined as "a collection of time-coincident samples. For instance, a linear PCM stereo sound file has two samples per frame, one for the left channel and one for the right channel.", i.e. as a subset of what javax.sound.sampled calls a "frame".
      This kind of discrepancy in nomenclature makes it extra important to have clear definitions somewhere in the javadocs and not just in the tutorial.


          People

            kizune Alexander Zuev
            hschreiber Hendrik Schreiber