JDK-8368232

Improve robustness of String constructors with mutable array inputs

    • Type: CSR
    • Resolution: Approved
    • Priority: P4
    • 21-pool
    • core-libs
    • None
    • behavioral
    • minimal
    • The compatibility risk is minimal: no behavior is specified for the case where the arguments are modified during construction.
    • Java API
    • Implementation

      Summary

      A JDK implementation-specific change that hardens the String constructors for the case where the data used to construct the String is modified during construction.

      Problem

      Strings are immutable once constructed, but they may be constructed from mutable arrays of bytes, characters, or integers. The String constructors should guard against the arrays being mutated during construction, because such mutation can invalidate internal invariants that operations on the resulting strings rely on. In particular, a number of operations have optimized paths for pairs of Latin1 strings and for pairs of non-Latin1 strings, while operations that mix a Latin1 string with a non-Latin1 string use a more general implementation.
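
      To illustrate why the invariant matters, here is a deliberately simplified model, not the JDK's actual code. The names CoderInvariantSketch, Text, encode, and fastEquals are hypothetical, and the big-endian byte layout is an assumption made for the sketch. It shows how an equality check that trusts the Latin1/non-Latin1 flag can report two values with identical characters as different when one of them is wrongly flagged as non-Latin1.

```java
import java.util.Arrays;

// Simplified, hypothetical model of a flag-plus-bytes string representation.
final class CoderInvariantSketch {
    static final byte LATIN1 = 0; // one byte per char
    static final byte UTF16 = 1;  // two bytes per char (big-endian in this sketch)

    static final class Text {
        final byte coder;
        final byte[] value;

        Text(byte coder, byte[] value) {
            this.coder = coder;
            this.value = value;
        }

        int length() {
            return coder == LATIN1 ? value.length : value.length / 2;
        }

        char charAt(int i) {
            if (coder == LATIN1) {
                return (char) (value[i] & 0xFF);
            }
            return (char) (((value[2 * i] & 0xFF) << 8) | (value[2 * i + 1] & 0xFF));
        }

        // Same-coder fast path: relies on the invariant that a UTF16-flagged
        // value really contains at least one non-Latin1 char.
        boolean fastEquals(Text other) {
            return coder == other.coder && Arrays.equals(value, other.value);
        }
    }

    // Encodes chars with the given coder WITHOUT validating the invariant,
    // so callers of this sketch can build a malformed value on purpose.
    static Text encode(char[] chars, byte coder) {
        byte[] out = new byte[coder == LATIN1 ? chars.length : chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            if (coder == LATIN1) {
                out[i] = (byte) chars[i];
            } else {
                out[2 * i] = (byte) (chars[i] >> 8);
                out[2 * i + 1] = (byte) chars[i];
            }
        }
        return new Text(coder, out);
    }
}
```

      In this model, encoding the characters "abc" once as LATIN1 and once as UTF16 yields two values that agree at every index under charAt, yet fastEquals returns false for the pair.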

      Solution

      This is a JDK implementation-specific change that hardens the String constructor implementation for the case where the data used to construct the String is modified during construction. The constructors now ensure that any string identified as non-Latin1 contains at least one non-Latin1 character.

      For Latin1 inputs (whether the arrays are encoded in ASCII, ISO-8859-1, UTF-8, or any other encoding that decodes to Latin1), the scanning and compression processes remain unchanged.

      If the scan detects a non-Latin1 character, the string is flagged as non-Latin1, and the constructor additionally verifies that a non-Latin1 character exists at the same index. If that character turns out to be Latin1, the input array has been modified and the scan result may be incorrect. A ConcurrentModificationException could be thrown at this point, but introducing the risk of an unexpected exception into an existing application is undesirable. Instead, the non-Latin1 version of the input is re-scanned and compressed, and the outcome of that scan determines whether the string is returned in its Latin1 or non-Latin1 form.
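
      The verify-then-rescan logic described above can be sketched roughly as follows. This is an illustrative model for a char[] input only, not the JDK source. The names CompactStringSketch, Encoded, compress, toUtf16Bytes, encode, and finish are hypothetical, and taking a private clone of the input before verification is an assumption made so that the second scan works on stable data.

```java
// Hypothetical sketch of hardened construction from a mutable char[].
final class CompactStringSketch {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Hypothetical holder for the chosen coder and the encoded bytes.
    static final class Encoded {
        final byte coder;
        final byte[] value;

        Encoded(byte coder, byte[] value) {
            this.coder = coder;
            this.value = value;
        }
    }

    // Copies leading Latin1 chars into dst; returns the index of the first
    // non-Latin1 char, or src.length if there is none. Each element is read
    // exactly once so the check and the store see the same value.
    static int compress(char[] src, byte[] dst) {
        int i = 0;
        for (; i < src.length; i++) {
            char c = src[i];
            if (c > 0xFF) {
                break;
            }
            dst[i] = (byte) c;
        }
        return i;
    }

    // Two bytes per char, big-endian (layout chosen for the sketch).
    static byte[] toUtf16Bytes(char[] src) {
        byte[] out = new byte[src.length * 2];
        for (int i = 0; i < src.length; i++) {
            out[2 * i] = (byte) (src[i] >> 8);
            out[2 * i + 1] = (byte) src[i];
        }
        return out;
    }

    static Encoded encode(char[] src) {
        byte[] latin1 = new byte[src.length];
        int scanned = compress(src, latin1);
        if (scanned == src.length) {
            return new Encoded(LATIN1, latin1); // Latin1 path, unchanged
        }
        // A non-Latin1 char was seen at index 'scanned'. Work on a private
        // copy from here on (assumption of this sketch).
        return finish(src.clone(), scanned);
    }

    // Second stage, split out so a mutation between scan and copy can be
    // simulated by passing a copy that no longer matches the scan result.
    static Encoded finish(char[] copy, int suspectIndex) {
        if (copy[suspectIndex] > 0xFF) {
            return new Encoded(UTF16, toUtf16Bytes(copy)); // scan confirmed
        }
        // The input changed under us: rescan the private copy instead of
        // throwing, and let that scan decide the final form.
        byte[] latin1 = new byte[copy.length];
        int scanned = compress(copy, latin1);
        return scanned == copy.length
                ? new Encoded(LATIN1, latin1)
                : new Encoded(UTF16, toUtf16Bytes(copy));
    }
}
```

      Calling finish(new char[] {'a', 'b', 'c'}, 1) simulates an array whose non-Latin1 character at index 1 was overwritten after the first scan; the sketch falls back to the Latin1 form rather than returning a non-Latin1 result with no non-Latin1 character in it.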

      A release note is planned to highlight this change in the JDK update releases.

      Specification

      No specification changes.

            jjose Johny Jose
            webbuggrp Webbug Group
            Roger Riggs, Sean Coffey