JDK-8352192

JNI Specification: Clarify how to correctly size the buffer for GetStringUTFRegion


    • Type: CSR
    • Resolution: Approved
    • Priority: P4
    • 25
    • hotspot
    • None
    • behavioral
    • minimal
    • This is a clarification of what the programmer needs to do to use the API correctly. There is no behavioural change.
    • Other
    • SE

      Summary

      Clarify that the buffer passed to GetStringUTFRegion should be one byte larger than the modified UTF-8 length of the characters being copied, to allow for null termination.

      Fix the description of the len parameter so that "start + len must be less than or equal to the string length".

      Problem

      The specification currently states:

      void GetStringUTFRegion(JNIEnv *env, jstring str, jsize start, jsize len, char *buf);

      Translates len number of Unicode characters beginning at offset start into modified UTF-8 encoding and place the result in the given buffer buf.

      The len argument specifies the number of unicode characters. The resulting number modified UTF-8 encoding characters may be greater than the given len argument. GetStringUTFLengthAsLong() may be used to determine the maximum size of the required character buffer.

      Since this specification does not require the resulting string copy be NULL terminated, it is advisable to clear the given character buffer (e.g. "memset()") before using this function, in order to safely perform strlen().

      This text leads programmers to believe that they only need to allocate a buffer of size GetStringUTFLengthAsLong(). However, if an implementation does NULL-terminate "the resulting string copy" then such a buffer will be too small.
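To make the sizing concern concrete, here is a minimal sketch in plain C (deliberately without the JNI headers, so it compiles standalone) of how many bytes a run of UTF-16 code units occupies in modified UTF-8, and of a buffer size that leaves room for a terminator. The helper names `modified_utf8_length` and `region_buffer_size` are hypothetical and not part of JNI. In real JNI code the length would come from `GetStringUTFLengthAsLong()`, which reports the encoded length of the whole string and is therefore an upper bound for any region of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode `len` UTF-16 code units in modified UTF-8.
 * Hypothetical helper for illustration only; not a JNI function. */
static size_t modified_utf8_length(const uint16_t *chars, size_t len) {
    size_t bytes = 0;
    for (size_t i = 0; i < len; i++) {
        uint16_t c = chars[i];
        if (c >= 0x0001 && c <= 0x007F) {
            bytes += 1;   /* ASCII other than NUL */
        } else if (c <= 0x07FF) {
            bytes += 2;   /* U+0000 and U+0080..U+07FF */
        } else {
            bytes += 3;   /* U+0800..U+FFFF, including each surrogate */
        }
    }
    return bytes;
}

/* Buffer size that also leaves room for a terminating NUL byte,
 * in case the implementation writes one. */
static size_t region_buffer_size(const uint16_t *chars, size_t len) {
    return modified_utf8_length(chars, len) + 1;
}
```

For the four code units 'h', U+00E9, U+20AC and U+0000, the encoded length is 1 + 2 + 3 + 2 = 8 bytes, so a buffer of 9 bytes is needed even though `len` is only 4.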

      In addition, the specification states:

      len: the number of Unicode characters to copy. Must be greater than zero, and "start + len" must be less than string length ("GetStringLength()").

      but if we are extracting the entire string then start + len will be equal to the string length, which the current wording does not permit.
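The corrected bound can be sketched as a small range check. This is an illustration only: `region_in_bounds` is a hypothetical helper, not a JNI function, and `int32_t` stands in for `jsize` so the snippet compiles without the JNI headers. The `len > 0` requirement follows the specification wording quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when [start, start + len) lies within a string of
 * str_len UTF-16 code units. Hypothetical helper; int32_t stands in
 * for jsize. */
static bool region_in_bounds(int32_t start, int32_t len, int32_t str_len) {
    if (start < 0 || len <= 0 || str_len < 0) {
        return false;
    }
    if (start > str_len) {
        return false;
    }
    /* Written as a subtraction so that start + len cannot overflow. */
    return len <= str_len - start;
}
```

With this check, copying a whole five-character string (`start = 0`, `len = 5`) is accepted, whereas a strict "less than" comparison would reject it.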

      Solution

      Expand the text to clarify the need for an extra space in the buffer in case of null termination.

      Correct the description of len to say "less than or equal to the string length".

      Specification

      @@ -2848,7 +2848,9 @@ the required character buffer.  
       Since this specification does not require the resulting string copy be NULL  
       terminated, it is advisable to clear the given character buffer (e.g.  
       "`memset()`") before using this function, in order to safely perform
      -`strlen()`.
      +`strlen()`. To allow for the possibility that the string copy is NULL terminated,
      +the given buffer should include space for a terminating NULL.
      +

      and

       `len`: the number of unicode characters to copy. Must be greater than zero, and
      -"`start + len`" must be less than string length ("`GetStringLength()`").
      +"`start + len`" must be less than, or equal to, the string length.
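The advice in the first hunk (clear the buffer, and leave space for a terminator) can be illustrated with a standalone C sketch. Everything here is an assumption made for illustration: `copy_without_terminator` merely stands in for an implementation that writes the encoded bytes and nothing more, and `copy_region_safely` is a hypothetical caller-side pattern, not JNI API.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for an implementation that writes n encoded bytes
 * and does NOT append a terminator. */
static void copy_without_terminator(char *buf, const char *src, size_t n) {
    memcpy(buf, src, n);
}

/* Allocates utf_len + 1 bytes, clears them, then fills the first
 * utf_len bytes. The result is safe to pass to strlen() whether or
 * not the copy routine terminates it. The caller frees the result. */
static char *copy_region_safely(const char *encoded, size_t utf_len) {
    char *buf = malloc(utf_len + 1);
    if (buf == NULL) {
        return NULL;
    }
    memset(buf, 0, utf_len + 1);
    copy_without_terminator(buf, encoded, utf_len);
    return buf;
}
```

Because the buffer is one byte larger than the encoded length and zeroed up front, it is terminated if the copy routine writes no terminator, and it is large enough if the routine does write one.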

            dholmes David Holmes
            Chris Plummer