Type: CSR
Resolution: Withdrawn
Priority: P3
Fix Version/s: None
Component/s: core-libs
Labels: None
Subcomponent: java.lang.foreign
Compatibility Kind: behavioral
Compatibility Risk: minimal
Compatibility Risk Description: Change to a preview API. The change also addresses a corner case for which the current behavior seems unlikely to be relied upon.
Interface Kind: Java API

Summary

Amend the documentation of SegmentAllocator::allocateUtf8String to describe what happens if the argument string contains null/0 characters.

Problem

If a Java string containing null/0 characters is converted to a C string, it might appear truncated, depending on the format expected by the native code, because a null/0 character can also mark the end of the string.

Similarly, when reading a string in Java through MemorySegment/MemoryAddress.getUtf8String, we treat a \0/null character as the terminator.
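The effect can be illustrated without the preview API. The following is a minimal sketch that only simulates the behavior with a plain byte array; the class NulTruncationSketch and its helpers toCString and fromCString are hypothetical names introduced for illustration and are not part of java.lang.foreign. It assumes the writer copies every UTF-8 byte and appends one terminator, and that the reader stops at the first 0 byte.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only: simulates C-string round-tripping with a byte array.
final class NulTruncationSketch {

    // Mimics writing a C string: all UTF-8 bytes (embedded 0 bytes included),
    // followed by a single 0 terminator.
    static byte[] toCString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(utf8, utf8.length + 1); // the extra trailing byte is 0
    }

    // Mimics reading a C string: decode up to, but not including, the first 0 byte.
    static String fromCString(byte[] buf) {
        int len = 0;
        while (len < buf.length && buf[len] != 0) {
            len++;
        }
        return new String(buf, 0, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "abc\0def";
        byte[] buf = toCString(original);
        System.out.println(buf.length);       // 8: seven UTF-8 bytes plus the terminator
        System.out.println(fromCString(buf)); // abc
    }
}
```

All seven bytes of the original string are copied, but a reader that stops at the first 0 byte sees only "abc".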

Solution

Amend the documentation to describe this behavior explicitly.

Specification

The javadoc changes are shown in the following diff:

diff --git a/src/java.base/share/classes/java/lang/foreign/MemorySegment.java b/src/java.base/share/classes/java/lang/foreign/MemorySegment.java
index 3b29756fb23..f2f9dd973ce 100644
--- a/src/java.base/share/classes/java/lang/foreign/MemorySegment.java
+++ b/src/java.base/share/classes/java/lang/foreign/MemorySegment.java
@@ -737,6 +737,12 @@ default String getUtf8String(long offset) {
      * sequences with this charset's default replacement string.  The {@link
      * java.nio.charset.CharsetDecoder} class should be used when more control
      * over the decoding process is required.
+     * <p>
+     * If the given string contains any {@code '\0'} characters, they will be
+     * copied as well. This means that, depending on the method used to read
+     * the string, such as {@link MemorySegment#getUtf8String(long)}, the string
+     * will appear truncated when read again.
+     *
      * @param offset offset in bytes (relative to this segment). For instance, if this segment is a {@linkplain #isNative() native} segment,
      *               the final address of this write operation can be expressed as {@code address().toRowLongValue() + offset}.
      * @param str the Java string to be written into this segment.
diff --git a/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java b/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java
index 095f360e97e..6687936d48c 100644
--- a/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java
+++ b/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java
@@ -71,6 +71,11 @@ public interface SegmentAllocator {
      * sequences with this charset's default replacement byte array.  The
      * {@link java.nio.charset.CharsetEncoder} class should be used when more
      * control over the encoding process is required.
+     * <p>
+     * If the given string contains any {@code '\0'} characters, they will be
+     * copied as well. This means that, depending on the method used to read
+     * the string, such as {@link MemorySegment#getUtf8String(long)}, the string
+     * will appear truncated when read again.
      *
      * @implSpec the default implementation for this method copies the contents of the provided Java string
      * into a new memory segment obtained by calling {@code this.allocate(str.length() + 1)}.

csr of

JDK-8289601 SegmentAllocator::allocateUtf8String(String str) should be clarified for strings containing \0

Resolved

Assignee: Jorn Vernee

Reporter: Leonid Kuskov

Created: 2022-07-04 06:01

Updated: 2022-07-08 07:04

Resolved: 2022-07-08 07:04