Specification for JEP 368: Text Blocks (Second Preview)

This document proposes changes to The Java(R) Language Specification, Java SE 14 Edition in support of Text Blocks, a preview feature of Java SE 14.

1.5 Preview Features

3.10 Literals

3.10.4 Character Literals

A character literal is expressed as a character or an escape sequence (3.10.6), enclosed in ASCII single quotes. (The single-quote, or apostrophe, character is \u0027.)

Character literals can only represent UTF-16 code units (3.1), i.e., they are limited to values from \u0000 to \uffff. Supplementary characters must be represented either as a surrogate pair within a char sequence, or as an integer, depending on the API they are used with.

The content of a character literal is the SingleCharacter or the EscapeSequence which follows the opening '.

It is a compile-time error for the character following the ~~SingleCharacter or EscapeSequence~~ content to be other than a '.

It is a compile-time error for a line terminator (3.4) to appear after the opening ' and before the closing '.

The character represented by a character literal is the content of the character literal with any escape sequence interpreted, as if by execution of String::translateEscapes on the content.
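The following sketch illustrates how a few character literals map to UTF-16 code units, and how a supplementary character can be held as an integer instead. The class and method names are hypothetical and chosen only for illustration.

```java
public class CharLiteralDemo {
    // Code units denoted by a plain character, three escape sequences, and an octal escape.
    static int[] codeUnits() {
        return new int[] { 'a', '\t', '\'', '\\', '\177' };
    }

    // A supplementary character does not fit in a char; here it is held as an int code point
    // and converted to a surrogate pair only when a String is needed.
    static int supplementaryLength() {
        int codePoint = 0x1F600;
        return new String(Character.toChars(codePoint)).length();
    }

    public static void main(String[] args) {
        for (int unit : codeUnits()) {
            System.out.printf("U+%04X%n", unit);
        }
        System.out.println(supplementaryLength());
    }
}
```

Note that the literal '\u0027' is deliberately not used here: Unicode escapes are translated before tokenization, so it would be read as ''' and rejected.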

3.10.5 String Literals

A string literal consists of zero or more characters enclosed in double quotes. Characters such as newlines may be represented by escape sequences (3.10.7).~~- one escape sequence for characters in the range U+0000 to U+FFFF, two escape sequences for the UTF-16 surrogate code units of characters in the range U+010000 to U+10FFFF~~

The content of a string literal is the sequence of characters that begins immediately after the opening " and ends immediately before the closing matching ".

It is a compile-time error for a line terminator to appear in the content of a string literal ~~after the opening " and before the closing matching "~~.

The string represented by a string literal is the content of the string literal with every escape sequence interpreted, as if by execution of String::translateEscapes on the content.
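As an illustration of the "as if by execution of String::translateEscapes" wording, the sketch below compares a literal whose escape sequence is interpreted by the compiler with a string that holds the same escape sequence uninterpreted. It assumes a JDK in which String::translateEscapes is available as a standard API (JDK 15 or later); the class name is hypothetical.

```java
public class StringEscapeDemo {
    // The compiler interprets \t, so this string holds three characters: a, TAB, b.
    static final String COMPILED = "a\tb";

    // The backslash is itself escaped, so this string holds four characters: a, \, t, b.
    static final String RAW = "a\\tb";

    public static void main(String[] args) {
        System.out.println(COMPILED.length());                       // 3
        System.out.println(RAW.length());                            // 4
        System.out.println(RAW.translateEscapes().equals(COMPILED)); // true
    }
}
```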

At run time, a string literal is a reference to an instance of class String (4.3.1, 4.3.3) that denotes the string represented by the string literal.

Moreover, a string literal always refers to the same instance of class String. This is because string literals - or, more generally, strings that are the values of constant expressions (15.28) - are “interned” so as to share unique instances, using the method String.intern (12.5).
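The effect of interning can be observed with reference comparison. The sketch below is illustrative only; the class and method names are hypothetical.

```java
public class InternDemo {
    static final String HELLO = "hello";

    // Two occurrences of the same literal refer to the same String instance.
    static boolean sameLiteral() {
        return HELLO == "hello";
    }

    // "hel" + "lo" is a constant expression, so its value is interned as well.
    static boolean constantExpression() {
        return HELLO == "hel" + "lo";
    }

    // lo is not a constant variable, so the concatenation happens at run time
    // and produces a newly created String.
    static boolean runtimeConcatenation() {
        String lo = "lo";
        return HELLO == "hel" + lo;
    }

    // Explicitly interning the run-time result yields the shared instance again.
    static boolean explicitIntern() {
        String lo = "lo";
        return HELLO == ("hel" + lo).intern();
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());          // true
        System.out.println(constantExpression());   // true
        System.out.println(runtimeConcatenation()); // false
        System.out.println(explicitIntern());       // true
    }
}
```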

3.10.6 Text Blocks

A text block is always of type String (4.3.3).

The opening delimiter is a sequence that starts with three double quote characters ("""), continues with zero or more space, tab, and form feed characters, and concludes with a line terminator.

The closing delimiter is a sequence of three double quote characters.

The content of a text block is the sequence of characters that begins immediately after the line terminator of the opening delimiter, and ends immediately before the first double quote of the closing delimiter.

Unlike in a string literal (3.10.5), it is not a compile-time error for a line terminator to appear in the content of a text block.

Example 3.10.6-1. Text Blocks

When multi-line strings are desired, a text block is usually more readable than a concatenation of string literals. For example, compare these alternative representations of a snippet of HTML:

String html = "<html>\n" +
              "    <body>\n" +
              "        <p>Hello, world</p>\n" +
              "    </body>\n" +
              "</html>\n";

String html = """
              <html>
                  <body>
                      <p>Hello, world</p>
                  </body>
              </html>
              """;

Here are some examples of text blocks:

String season = """
                winter""";    // the six characters w i n t e r

String period = """
                winter
                """;          // the seven characters w i n t e r LF

String greeting = 
    """
    Hi, "Bob"
    """;        // the ten characters H i , SP " B o b " LF

String salutation =
    """
    Hi,
     "Bob"
    """;        // the eleven characters H i , LF SP " B o b " LF

String empty = """
               """;      // the empty string (zero length)

String quote = """
               "
               """;      // the two characters " LF

String backslash = """
                   \\
                   """;  // the two characters \ LF

The use of the escape sequences \" and \n is permitted in a text block, but not necessary or recommended. However, representing the sequence """ in a text block requires the escaping of at least one " character, to avoid mimicking the closing delimiter.

Example 3.10.6-2. Escape sequences in text blocks

The following snippet of text would be less readable if the " characters were escaped:

String story = """
    "When I use a word," Humpty Dumpty said,
    in rather a scornful tone, "it means just what I
    choose it to mean - neither more nor less."
    "The question is," said Alice, "whether you
    can make words mean so many different things."
    "The question is," said Humpty Dumpty,
    "which is to be master - that's all."
    """;

If a text block is to denote another text block, then it is recommended to escape the first " of the embedded opening and closing delimiters:

String code = 
    """
    String text = \"""
        A text block inside a text block
    \""";
    """;

The string represented by a text block is not the literal sequence of characters in the content. Instead, the string represented by a text block is the result of applying the following transformations to the content, in order:

1. Line terminators are normalized to the ASCII LF character, as follows:
   - An ASCII CR character followed by an ASCII LF character is translated to an ASCII LF character.
   - An ASCII CR character is translated to an ASCII LF character.
2. Incidental white space is removed, as if by execution of String::stripIndent on the characters resulting from step 1.
3. Escape sequences are interpreted, as if by execution of String::translateEscapes on the characters resulting from step 2.
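The three transformations can be sketched as ordinary method calls applied to the raw content of a text block. This is an illustration under stated assumptions, not a description of how a compiler is implemented: the helper interpret is hypothetical, and String::stripIndent and String::translateEscapes are assumed to be available as standard API (JDK 15 or later).

```java
public class TextBlockPipeline {
    // Hypothetical helper: applies the three transformations, in order, to raw content.
    static String interpret(String content) {
        // Step 1: CR LF becomes LF, then any remaining CR becomes LF.
        String normalized = content.replace("\r\n", "\n").replace('\r', '\n');
        // Step 2: remove incidental white space.
        String stripped = normalized.stripIndent();
        // Step 3: interpret escape sequences.
        return stripped.translateEscapes();
    }

    public static void main(String[] args) {
        // Raw content as it might appear in a source file with CR LF line endings.
        // "\\t" is the two characters \ and t, i.e. an escape sequence not yet interpreted.
        String content = "    Hi,\r\n     \"Bob\"\\t\r\n    ";

        // The same content written as a text block.
        String block = """
            Hi,
             "Bob"\t
            """;

        System.out.println(interpret(content).equals(block)); // true
    }
}
```

Because the \t escape is interpreted only in step 3, it is not treated as trailing white space in step 2 and survives into the resulting string.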

Example 3.10.6-3. Order of transformations on text block content

Interpreting escape sequences last allows developers to use \n, \f, and \r for vertical formatting of a string without affecting the normalization of line terminators, and to use \b and \t for horizontal formatting of a string without affecting the removal of incidental white space. For example, consider this text block that mentions the escape sequence \r (CR):

String html = """
              <html>\r
                  <body>\r
                      <p>Hello, world</p>\r
                  </body>\r
              </html>\r
              """;

The \r escapes are not interpreted until after the line terminators have been normalized to LF. Using Unicode escapes to visualize LF (\u000A) and CR (\u000D), and using | to visualize the left margin, the final result is:

|<html>\u000D\u000A
|    <body>\u000D\u000A
|        <p>Hello, world</p>\u000D\u000A
|    </body>\u000D\u000A
|</html>\u000D\u000A

When this specification says that a text block contains a particular character or sequence of characters, or that a particular character or sequence of characters is in a text block, it means that the string represented by the text block (as opposed to the content of the text block) contains the character or sequence of characters.

At run time, a text block is a reference to an instance of class String that denotes the string represented by the text block.

A text block always refers to the same instance of class String. This is because the strings represented by text blocks - or, more generally, strings that are the values of constant expressions (15.28) - are “interned” so as to share unique instances (12.5).
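A consequence is that a text block and a string literal that represent the same string refer to the same String instance. The sketch below illustrates this; the class name is hypothetical and a JDK with text blocks enabled is assumed.

```java
public class TextBlockInternDemo {
    static final String LITERAL = "winter\n";

    static final String BLOCK = """
        winter
        """;

    // Both are constant expressions with the same value, so they share one instance.
    static boolean sameInstance() {
        return LITERAL == BLOCK;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance()); // true
    }
}
```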

Example 3.10.6-4. Text blocks evaluate to strings

Text blocks can be used wherever an expression of type String is allowed, such as in string concatenation (15.18.1), in method invocation on class String, and in annotations with String elements:

System.out.println("abc" + """
                           cde
                           """);

String math = """
              1+1 equals
              """ + " " + String.valueOf(2);

String cde = """
             abcde""".substring(2);

@Precondition("""
    rate > 0 &&
    rate <= MAX_REFRESH_RATE
""")
public void setRefreshRate(int rate) { ... }

3.10.7 Escape Sequences ~~for Character and String Literals~~

In character literals (3.10.4), string literals (3.10.5), and text blocks (3.10.6), the ~~character and string~~ escape sequences allow for the representation of some nongraphic characters without using Unicode escapes (3.3), as well as the single quote, double quote, and backslash characters.

It is a compile-time error if the character following a backslash in an escape sequence is not a LineTerminator or an ASCII b, s, t, f, n, r, ", ', \, 0, 1, 2, 3, 4, 5, 6, or 7.

An escape sequence in the content of a character literal, string literal, or text block is interpreted by replacing its \ and trailing characters with the single character denoted by the Unicode escape in the EscapeSequence grammar. The line continuation escape sequence has no corresponding Unicode escape, so it is interpreted by replacing it with nothing.

The line continuation escape sequence may appear in a text block, but cannot appear in a character literal (3.10.4) or a string literal (3.10.5) because each disallows a LineTerminator.
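The sketch below illustrates the line continuation escape sequence and the \s escape sequence in text blocks. The class and field names are hypothetical, and a JDK that supports these escape sequences is assumed.

```java
public class EscapeDemo {
    // The \ at the end of the first line is a line continuation: it and the
    // line terminator that follows are replaced with nothing.
    static final String ONE_LINE = """
        Lorem ipsum dolor sit amet, \
        consectetur adipiscing elit""";

    // \s denotes a space. Because escapes are interpreted after incidental
    // white space is removed, the spaces before \s are not stripped.
    static final String PADDED = """
        red  \s
        green\s
        """;

    public static void main(String[] args) {
        System.out.println(ONE_LINE.contains("\n")); // false
        System.out.println(PADDED.equals("red   \ngreen \n")); // true
    }
}
```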

Miscellaneous changes

Legal Notice

ORACLE AMERICA, INC. IS WILLING TO LICENSE THIS SPECIFICATION TO YOU ONLY UPON THE CONDITION THAT YOU ACCEPT ALL OF THE TERMS CONTAINED IN THIS LICENSE AGREEMENT (“AGREEMENT”). PLEASE READ THE TERMS AND CONDITIONS OF THIS AGREEMENT CAREFULLY.

Specification: JSR-389 Java SE 14 (“Specification”)
Version: 14
Status: Early Draft Review
Release: December 2019

LIMITED LICENSE GRANTS

The Specification is protected by copyright and the information described therein may be protected by one or more U.S. patents, foreign patents, or pending applications. Except as provided under the following license, no part of the Specification may be reproduced in any form by any means without the prior written authorization of Oracle America, Inc. (“Oracle”) and its licensors, if any. Any use of the Specification and the information described therein will be governed by the terms and conditions of this Agreement.

Subject to the terms and conditions of this license, including your compliance with Paragraphs 1 and 2 below, Oracle hereby grants you a fully-paid, non-exclusive, non-transferable, limited license (without the right to sublicense) under Oracle’s intellectual property rights to:

Review the Specification for the purposes of evaluation. This includes: (i) developing implementations of the Specification for your internal, non-commercial use; (ii) discussing the Specification with any third party; and (iii) excerpting brief portions of the Specification in oral or written communications which discuss the Specification provided that such excerpts do not in the aggregate constitute a significant portion of the Technology.
Distribute implementations of the Specification to third parties for their testing and evaluation use, provided that any such implementation:
1. does not modify, subset, superset or otherwise extend the Licensor Name Space, or include any public or protected packages, classes, Java interfaces, fields or methods within the Licensor Name Space other than those required/authorized by the Specification or Specifications being implemented;
2. is clearly and prominently marked with the word “UNTESTED” or “EARLY ACCESS” or “INCOMPATIBLE” or “UNSTABLE” or “BETA” in any list of available builds and in proximity to every link initiating its download, where the list or link is under Licensee’s control; and
3. includes the following notice: “This is an implementation of an early-draft specification developed under the Java Community Process (JCP) and is made available for testing and evaluation purposes only. The code is not compatible with any specification of the JCP.”

The grant set forth above concerning your distribution of implementations of the specification is contingent upon your agreement to terminate development and distribution of your “early draft” implementation as soon as feasible following final completion of the specification. If you fail to do so, the foregoing grant shall be considered null and void.

No provision of this Agreement shall be understood to restrict your ability to make and distribute to third parties applications written to the Specification.

Other than this limited license, you acquire no right, title or interest in or to the Specification or any other Oracle intellectual property, and the Specification may only be used in accordance with the license terms set forth herein. This license will expire on the earlier of: (a) two (2) years from the date of Release listed above; (b) the date on which the final version of the Specification is publicly released; or (c) the date on which the Java Specification Request (JSR) to which the Specification corresponds is withdrawn. In addition, this license will terminate immediately without notice from Oracle if you fail to comply with any provision of this license. Upon termination, you must cease use of or destroy the Specification.

“Licensor Name Space” means the public class or interface declarations whose names begin with “java”, “javax”, “com.oracle” or their equivalents in any subsequent naming convention adopted by Oracle through the Java Community Process, or any recognized successors or replacements thereof.

TRADEMARKS

No right, title, or interest in or to any trademarks, service marks, or trade names of Oracle or Oracle’s licensors is granted hereunder. Oracle, the Oracle logo, and Java are trademarks or registered trademarks of Oracle America, Inc. in the U.S. and other countries.

DISCLAIMER OF WARRANTIES

THE SPECIFICATION IS PROVIDED “AS IS” AND IS EXPERIMENTAL AND MAY CONTAIN DEFECTS OR DEFICIENCIES WHICH CANNOT OR WILL NOT BE CORRECTED BY ORACLE. ORACLE MAKES NO REPRESENTATIONS OR WARRANTIES, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT THAT THE CONTENTS OF THE SPECIFICATION ARE SUITABLE FOR ANY PURPOSE OR THAT ANY PRACTICE OR IMPLEMENTATION OF SUCH CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADE SECRETS OR OTHER RIGHTS. This document does not represent any commitment to release or implement any portion of the Specification in any product.

THE SPECIFICATION COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION THEREIN; THESE CHANGES WILL BE INCORPORATED INTO NEW VERSIONS OF THE SPECIFICATION, IF ANY. ORACLE MAY MAKE IMPROVEMENTS AND/OR CHANGES TO THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THE SPECIFICATION AT ANY TIME. Any use of such changes in the Specification will be governed by the then-current license for the applicable version of the Specification.

LIMITATION OF LIABILITY

TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL ORACLE OR ITS LICENSORS BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION, LOST REVENUE, PROFITS OR DATA, OR FOR SPECIAL, INDIRECT, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF OR RELATED TO ANY FURNISHING, PRACTICING, MODIFYING OR ANY USE OF THE SPECIFICATION, EVEN IF ORACLE AND/OR ITS LICENSORS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

You will hold Oracle (and its licensors) harmless from any claims based on your use of the Specification for any purposes other than the limited right of evaluation as described above, and from any claims that later versions or releases of any Specification furnished to you are incompatible with the Specification provided to you under this license.

RESTRICTED RIGHTS LEGEND

If this Software is being acquired by or on behalf of the U.S. Government or by a U.S. Government prime contractor or subcontractor (at any tier), then the Government’s rights in the Software and accompanying documentation shall be only as set forth in this license; this is in accordance with 48 C.F.R. 227.7201 through 227.7202-4 (for Department of Defense (DoD) acquisitions) and with 48 C.F.R. 2.101 and 12.212 (for non-DoD acquisitions).

REPORT

You may wish to report any ambiguities, inconsistencies or inaccuracies you may find in connection with your evaluation of the Specification (“Feedback”). To the extent that you provide Oracle with any Feedback, you hereby: (i) agree that such Feedback is provided on a non-proprietary and nonconfidential basis, and (ii) grant Oracle a perpetual, non-exclusive, worldwide, fully paid-up, irrevocable license, with the right to sublicense through multiple levels of sublicensees, to incorporate, disclose, and use without limitation the Feedback for any purpose related to the Specification and future versions, implementations, and test suites thereof.

GENERAL TERMS

Any action related to this Agreement will be governed by California law and controlling U.S. federal law. The U.N. Convention for the International Sale of Goods and the choice of law rules of any jurisdiction will not apply. The Specification is subject to U.S. export control laws and may be subject to export or import regulations in other countries. Licensee agrees to comply strictly with all such laws and regulations and acknowledges that it has the responsibility to obtain such licenses to export, re-export or import as may be required after delivery to Licensee.

This Agreement is the parties’ entire agreement relating to its subject matter. It supersedes all prior or contemporaneous oral or written communications, proposals, conditions, representations and warranties and prevails over any conflicting or additional terms of any quote, order, acknowledgment, or other communication between the parties relating to its subject matter during the term of this Agreement. No modification to this Agreement will be binding, unless in writing and signed by an authorized representative of each party.