-
CSR
-
Resolution: Withdrawn
-
P3
-
None
-
None
-
behavioral
-
minimal
-
Change to a preview API. The change also addresses a corner case for which the current behavior seems unlikely to be relied upon.
-
Java API
Summary
Amend the documentation of SegmentAllocator::allocateUtf8String to describe what happens if the argument string contains null/0 bytes.
Problem
If a Java string containing null/0 characters is converted to a C string, it might appear truncated depending on the format expected by the native code, since a null/0 character can also indicate the terminator of the string.
Similarly, when reading a string in Java through MemorySegment/MemoryAddress.getUtf8String, we treat a \0/null character as the terminator.
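The truncation described above can be sketched without the preview API. The example below is only an emulation on plain byte arrays, under the assumption that the string is encoded as UTF-8 and followed by a single 0 terminator. The class NulTruncationSketch and the helpers toCString and fromCString are hypothetical names introduced for illustration and are not part of java.lang.foreign.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Emulation only: plain byte arrays stand in for a native memory segment.
public class NulTruncationSketch {

    // Hypothetical helper: encode the string as UTF-8 and append one 0 byte,
    // mimicking the layout of a NUL-terminated C string. Embedded '\0'
    // characters are copied like any other character.
    static byte[] toCString(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return Arrays.copyOf(utf8, utf8.length + 1); // the extra trailing byte is 0
    }

    // Hypothetical helper: decode bytes up to (not including) the first 0 byte,
    // which is how a reader that expects a NUL-terminated string behaves.
    static String fromCString(byte[] buf) {
        int len = 0;
        while (len < buf.length && buf[len] != 0) {
            len++;
        }
        return new String(buf, 0, len, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "foo\0bar";            // 7 chars, one of them '\0'
        byte[] encoded = toCString(original);    // all 7 bytes copied, plus terminator
        String readBack = fromCString(encoded);  // reader stops at the embedded '\0'

        System.out.println(encoded.length);      // prints 8
        System.out.println(readBack);            // prints foo
    }
}
```

All characters of the original string are present in the encoded bytes, yet a reader that stops at the first 0 byte sees only the prefix before the embedded '\0'.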
Solution
Amend the documentation to describe this behavior explicitly.
Specification
The javadoc has the following diff:
diff --git a/src/java.base/share/classes/java/lang/foreign/MemorySegment.java b/src/java.base/share/classes/java/lang/foreign/MemorySegment.java
index 3b29756fb23..f2f9dd973ce 100644
--- a/src/java.base/share/classes/java/lang/foreign/MemorySegment.java
+++ b/src/java.base/share/classes/java/lang/foreign/MemorySegment.java
@@ -737,6 +737,12 @@ default String getUtf8String(long offset) {
* sequences with this charset's default replacement string. The {@link
* java.nio.charset.CharsetDecoder} class should be used when more control
* over the decoding process is required.
+ * <p>
+ * If the given string contains any {@code '\0'} characters, they will be
+ * copied as well. This means that, depending on the method used to read
+ * the string, such as {@link MemorySegment#getUtf8String(long)}, the string
+ * will appear truncated when read again.
+ *
* @param offset offset in bytes (relative to this segment). For instance, if this segment is a {@linkplain #isNative() native} segment,
* the final address of this write operation can be expressed as {@code address().toRowLongValue() + offset}.
* @param str the Java string to be written into this segment.
diff --git a/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java b/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java
index 095f360e97e..6687936d48c 100644
--- a/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java
+++ b/src/java.base/share/classes/java/lang/foreign/SegmentAllocator.java
@@ -71,6 +71,11 @@ public interface SegmentAllocator {
* sequences with this charset's default replacement byte array. The
* {@link java.nio.charset.CharsetEncoder} class should be used when more
* control over the encoding process is required.
+ * <p>
+ * If the given string contains any {@code '\0'} characters, they will be
+ * copied as well. This means that, depending on the method used to read
+ * the string, such as {@link MemorySegment#getUtf8String(long)}, the string
+ * will appear truncated when read again.
*
* @implSpec the default implementation for this method copies the contents of the provided Java string
* into a new memory segment obtained by calling {@code this.allocate(str.length() + 1)}.
csr of: JDK-8289601 SegmentAllocator::allocateUtf8String(String str) should be clarified for strings containing \0 (Resolved)