JDK-8311906

Improve robustness of String constructors with mutable array inputs

Details

    • b27
    • generic
    • generic
    • Verified

    Description

      A DESCRIPTION OF THE PROBLEM :
      A race condition in the String constructor that takes a char[] (and probably in other constructors too) allows creating a String with an incorrect coder: a String that contains only latin-1 characters but is still encoded as UTF-16.
      This happens because the contents of the passed-in array may change between the moment the constructor checks whether the content can be encoded as latin-1 and the moment it encodes the content as UTF-16.
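
The check-then-encode window described above can be reproduced deterministically with a simplified model. This is only a sketch, not the JDK's actual code: the class `CoderRaceSketch`, the `Encoded` holder, and every method name below are hypothetical, and the `Runnable` argument stands in for a concurrent writer that runs between the two steps. The snapshot variant shows one possible mitigation (working on a private copy) and is not necessarily how the JDK addressed the issue.

```java
public class CoderRaceSketch {
    static final int LATIN1 = 0;
    static final int UTF16 = 1;

    /** Hypothetical stand-in for a String's (coder, bytes) pair. */
    static final class Encoded {
        final int coder;
        final byte[] bytes;
        Encoded(int coder, byte[] bytes) { this.coder = coder; this.bytes = bytes; }
    }

    /** Returns one byte per char, or null if any char is above 0xFF. */
    static byte[] tryCompress(char[] value) {
        byte[] out = new byte[value.length];
        for (int i = 0; i < value.length; i++) {
            char c = value[i];
            if (c > 0xFF) return null;
            out[i] = (byte) c;
        }
        return out;
    }

    /** Encodes every char as two bytes, high byte first. */
    static byte[] toUtf16(char[] value) {
        byte[] out = new byte[value.length * 2];
        for (int i = 0; i < value.length; i++) {
            char c = value[i];
            out[2 * i] = (byte) (c >> 8);
            out[2 * i + 1] = (byte) c;
        }
        return out;
    }

    /** Reads the shared array twice; the writer runs in between. */
    static Encoded encodeUnsafe(char[] value, Runnable writer) {
        byte[] latin1 = tryCompress(value);
        if (latin1 != null) return new Encoded(LATIN1, latin1);
        writer.run(); // assumption: models a concurrent modification
        return new Encoded(UTF16, toUtf16(value));
    }

    /** Same steps, but both reads see a private snapshot of the input. */
    static Encoded encodeWithSnapshot(char[] value, Runnable writer) {
        return encodeUnsafe(value.clone(), writer);
    }

    /** A UTF16 result is canonical only if some char needs the high byte. */
    static boolean isCanonical(Encoded e) {
        if (e.coder == LATIN1) return true;
        for (int i = 0; i < e.bytes.length; i += 2) {
            if (e.bytes[i] != 0) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        char[] racy = {(char) ('f' ^ 256), 'o', 'o'};
        Encoded broken = encodeUnsafe(racy, () -> racy[0] ^= 256);
        System.out.println(isCanonical(broken)); // false

        char[] guarded = {(char) ('f' ^ 256), 'o', 'o'};
        Encoded ok = encodeWithSnapshot(guarded, () -> guarded[0] ^= 256);
        System.out.println(isCanonical(ok)); // true
    }
}
```

In the unsafe path the compression attempt fails while the first char is above 0xFF, the writer flips it back, and the UTF-16 step then copies latin-1-only content. In a real multithreaded setting the copy itself is not atomic, but every later step at least sees one consistent view of the data.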

      See https://wouter.coekaerts.be/2023/breaking-string

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Concurrently modify the char[] passed into the String constructor. See example code.

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      A String where .equals and other methods behave correctly
      ACTUAL -
      A String where .equals and other methods are inconsistent with its contents

      ---------- BEGIN SOURCE ----------
      /**
       * Given a latin-1 String, creates a copy that is
       * incorrectly encoded as UTF-16.
       */
      static String breakIt(String original) {
        if (original.chars().max().orElseThrow() > 255) {
          throw new IllegalArgumentException(
              "Can only break latin-1 Strings");
        }

        char[] chars = original.toCharArray();

        // in another thread, flip the first character back
        // and forth between being encodable as latin-1 or not
        Thread thread = new Thread(() -> {
          while (!Thread.interrupted()) {
            chars[0] ^= 256;
          }
        });
        thread.start();

        // at the same time call the String constructor,
        // until we hit the race condition
        while (true) {
          String s = new String(chars);
          if (s.charAt(0) < 256 && !original.equals(s)) {
            thread.interrupt();
            return s;
          }
        }
      }


      String a = "foo";
      String b = breakIt(a);

      // they are not equal to each other
      System.out.println(a.equals(b));
      // => false

      // they do contain the same series of characters
      System.out.println(Arrays.equals(a.toCharArray(),
          b.toCharArray()));
      // => true
      ---------- END SOURCE ----------
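
Code that receives a String it cannot trust to be canonical could rebuild it from a private array, since `toCharArray()` returns a fresh copy that no other thread holds a reference to. The sketch below is a possible caller-side workaround under that assumption, not part of the reported fix; the class and method names are hypothetical.

```java
public class CanonicalCopy {
    /** Rebuilds s from a private char[] that no other thread can modify. */
    static String canonicalCopy(String s) {
        char[] privateChars = s.toCharArray(); // fresh array, not shared
        return new String(privateChars);       // constructor sees stable input
    }

    public static void main(String[] args) {
        String a = "foo";
        System.out.println(a.equals(canonicalCopy(a))); // true
    }
}
```

Applied to the `b` produced by `breakIt` above, `a.equals(canonicalCopy(b))` should return true again, because the constructor now encodes stable input.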

      FREQUENCY : always


            People

              rriggs Roger Riggs
              webbuggrp Webbug Group
              Votes: 1
              Watchers: 21