Copyright © 2018 Oracle America, Inc. Legal Notice
This document proposes changes to the Java Language Specification to support raw string literals. See JEP 8196004 for an overview.
(Production difficulties prevent the notes and examples which are new in this document from being colored green.)
...
The Unicode characters resulting from the lexical translations are reduced to a sequence of input elements (3.5), which are white space (3.6), comments (3.7), and tokens. The tokens are the identifiers (3.8), keywords (3.9), literals (3.10), separators (3.11), and operators (3.12) of the syntactic grammar.
Among the input elements, raw string literals (3.10.7) are special because they effectively opt out of the lexical translations. As a result, they can directly include textual fragments of other programs which themselves include Unicode escapes and other escape sequences.
Except for comments (3.7), identifiers, and the contents of character and string and raw string literals (3.10.4, 3.10.5, 3.10.7), all input elements (3.5) in a program are formed only from ASCII characters (or Unicode escapes (3.3) which result in ASCII characters).
...
If an eligible \ is followed by u, or more than one u, and the last u is not followed by four hexadecimal digits, then the eligible \ and all the u characters which follow are treated as RawInputCharacters and remain part of the escaped Unicode stream. If the third step of lexical translation (3.5) results in these RawInputCharacters becoming part of an input element that is not a raw string literal (3.10.7), then a compile-time error occurs.
Thus, this is legal:
String tm = "The \u2122 symbol";
But the following code, which truncates the Unicode escape, is not legal:
String tm = "The \u212 symbol";
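The eligibility rule can be sketched as a small scanner over raw source text. This is a minimal illustration, not the compiler's implementation: the class TruncatedEscapeFinder and its method find are hypothetical names, and the sketch only locates truncated escapes. Deciding whether one ends up inside a raw string literal (and is therefore permitted) is left to the third step of lexical translation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it is not part of the JDK or of javac.
final class TruncatedEscapeFinder {

    // Returns the offset of every eligible backslash that is followed by one or
    // more 'u' characters which are not followed by four hexadecimal digits.
    static List<Integer> find(String source) {
        List<Integer> offsets = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            if (source.charAt(i) != '\\') {
                i++;
                continue;
            }
            // Consume the whole run of contiguous backslashes.
            int runStart = i;
            while (i < source.length() && source.charAt(i) == '\\') {
                i++;
            }
            int lastBackslash = i - 1;
            // The last backslash is eligible when an even number of backslashes
            // contiguously precede it, that is, when the run length is odd.
            boolean eligible = (i - runStart) % 2 == 1;
            if (!eligible || i >= source.length() || source.charAt(i) != 'u') {
                continue;
            }
            while (i < source.length() && source.charAt(i) == 'u') {
                i++;
            }
            if (hasFourHexDigits(source, i)) {
                i += 4; // a complete escape; keep scanning after it
            } else {
                offsets.add(lastBackslash);
            }
        }
        return offsets;
    }

    // Only ASCII hexadecimal digits count, matching the HexDigit production.
    private static boolean hasFourHexDigits(String s, int from) {
        if (from + 4 > s.length()) {
            return false;
        }
        for (int k = from; k < from + 4; k++) {
            char c = s.charAt(k);
            boolean hex = (c >= '0' && c <= '9')
                    || (c >= 'a' && c <= 'f')
                    || (c >= 'A' && c <= 'F');
            if (!hex) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The doubled backslashes are ordinary Java escape sequences, so each
        // argument holds a single backslash in front of the 'u'.
        System.out.println(find("String tm = \"The \\u2122 symbol\";"));
        System.out.println(find("String tm = \"The \\u212 symbol\";"));
    }
}
```

The first call reports no offsets and the second reports the offset of the backslash that starts the truncated escape.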
Raw string literals are unique in that they avoid Unicode escape processing. The string literal:
"\\u2122=\u2122"
contains nine characters between its double quotes after Unicode escape processing, and before its escape sequences (3.10.6) are interpreted (TM indicates the trademark symbol):
\ \ u 2 1 2 2 = TM
whereas the raw string literal:
`\\u2122=\u2122`
represents a string of 14 characters:
\ \ u 2 1 2 2 = \ u 2 1 2 2
Since raw string literals do not contain Unicode escapes that could be considered truncated, this is legal:
String tm = `The \u212 symbol`;
but the comment in the following code is not legal, since it contains a truncated Unicode escape outside the raw string literal:
String tm = `The \u212 symbol`; // We use \u212 because ...
Literal:
IntegerLiteral
FloatingPointLiteral
BooleanLiteral
CharacterLiteral
StringLiteral
RawStringLiteral
NullLiteral
A raw string literal consists of one or more characters enclosed in ASCII backtick characters. Characters that would be represented with escape sequences (3.10.6) in a string literal, such as newlines and double quotes, can be represented directly in a raw string literal. A raw string literal can also represent character sequences that would denote Unicode escapes anywhere else in the program; this facility causes the string represented by a raw string literal to be derived in a unique manner.
RawStringLiteral:
RawStringDelimiter RawStringBody RawStringDelimiter
RawStringDelimiter:
` {`}
RawStringBody:
UnicodeInputCharacter {UnicodeInputCharacter}
It is a compile-time error if any backtick character in a RawStringDelimiter was lexically translated from the Unicode escape \u0060.
It is undesirable to allow the Unicode escape \u0060 (`) to serve as the opening delimiter because the same six-character sequence cannot serve as the closing delimiter. Thus, the following is illegal:
String s = \u0060Hi Bob`;
This boundary between the "outside" and the "inside" of a raw string literal is the only place in the Java programming language where a Unicode escape is disallowed.
The delimiters of a raw string literal must be balanced. It is a compile-time error if the opening RawStringDelimiter is not identical to the closing RawStringDelimiter.
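The balancing rule can be pictured as a scanner that measures the opening run of backticks and then looks for the next run of exactly the same length. The following is a minimal sketch under that reading of the grammar; RawStringScanner and scanBody are hypothetical names, and treating backtick runs of any other length as body content is an assumption based on the examples in this section.

```java
// Hypothetical sketch for illustration; it is not javac's lexer.
final class RawStringScanner {

    // Scans the raw string literal that starts at index 'start' of 'source'
    // and returns the text between its balanced delimiters.
    static String scanBody(String source, int start) {
        if (start >= source.length() || source.charAt(start) != '`') {
            throw new IllegalArgumentException("no opening backtick at " + start);
        }
        // The opening delimiter is the longest run of backticks at 'start'.
        int bodyStart = start;
        while (bodyStart < source.length() && source.charAt(bodyStart) == '`') {
            bodyStart++;
        }
        int delimiterLength = bodyStart - start;

        int i = bodyStart;
        while (i < source.length()) {
            if (source.charAt(i) != '`') {
                i++;
                continue;
            }
            int runStart = i;
            while (i < source.length() && source.charAt(i) == '`') {
                i++;
            }
            if (i - runStart == delimiterLength) {
                // A run identical to the opening delimiter closes the literal.
                return source.substring(bodyStart, runStart);
            }
            // Assumption: a run of any other length belongs to the body.
        }
        throw new IllegalArgumentException("unterminated raw string literal");
    }

    public static void main(String[] args) {
        System.out.println(scanBody("`raw`", 0));
        System.out.println(scanBody("String s = ``Hi, `Bob`.``;", 11));
    }
}
```

Because the opening run is consumed greedily, the character after it is never a backtick, so any body this sketch returns is non-empty.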
The body of a raw string literal is the sequence of input characters and line terminators that served as input to the third step of lexical translation (3.5) in order to yield the RawStringBody of the literal. However, the string represented by a raw string literal is not the body. Instead, the string represented by a raw string literal is based on the sequence of raw Unicode characters that served as input to the first step of lexical translation (3.2) and subsequently became the body after the first and second steps. In particular, the string is the sequence of raw Unicode characters with the following translations applied, in order:
an ASCII CR character followed by an ASCII LF character is translated to an ASCII LF character.
an ASCII CR character is translated to an ASCII LF character.
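The two translations above can be sketched as follows. The class and method names are hypothetical, and the input is assumed to be the sequence of raw Unicode characters that became the body of the literal.

```java
// Hypothetical helper for illustration; it is not part of the JDK.
final class RawStringLineTerminators {

    // Applies the two translations in order: CR LF pairs first, then any
    // CR that remains. Reversing the order would turn CR LF into two LFs.
    static String normalize(String rawCharacters) {
        return rawCharacters.replace("\r\n", "\n").replace('\r', '\n');
    }

    public static void main(String[] args) {
        String normalized = normalize("Hi,\r\n Bob\r");
        System.out.println(normalized.equals("Hi,\n Bob\n"));
    }
}
```

Only actual CR and LF characters are affected. A six-character sequence that spells a Unicode escape for CR or LF inside the raw text is left alone, since raw string literals avoid Unicode escape processing.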
Examples of raw string literals:
`raw` // the three characters r a w
`Hi, "Bob".` // the ten characters H i , SP " B o b " .
`\(.\)\1` // the seven characters \ ( . \ ) \ 1
`Hi,
 Bob
` // the nine characters H i , LF SP B o b LF
````````````
Hello, world
```````````` // the 14 characters LF H e l l o , SP w o r l d LF
`\n` // the two characters \ n (not LF)
`\uvw` // the four characters \ u v w (not a Unicode escape)
`\u0060` // the six characters \ u 0 0 6 0 (not `)
`\u000a` // the six characters \ u 0 0 0 a (not LF)
`\u000d\u000a` // the 12 characters \ u 0 0 0 d \ u 0 0 0 a
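For comparison, the strings represented by several of the examples above can be written as ordinary string literals, at the cost of escape sequences. The sketch below is illustrative only; the class and constant names are hypothetical, and ordinary literals are used so that the code does not depend on the proposed syntax.

```java
// Hypothetical class for illustration; each constant is an ordinary string
// literal that represents the same string as one raw string literal example.
final class RawStringEquivalents {

    static final String RAW = "raw";                   // r a w
    static final String QUOTED = "Hi, \"Bob\".";       // double quotes need escaping
    static final String REGEX = "\\(.\\)\\1";          // each backslash is doubled
    static final String TWO_LINES = "Hi,\n Bob\n";     // line terminators become LF escapes
    static final String BACKSLASH_N = "\\n";           // a backslash and the letter n
    static final String NOT_AN_ESCAPE = "\\uvw";       // a backslash and the letters u v w
    static final String SPELLED_BACKTICK = "\\u0060";  // six characters, not a backtick

    public static void main(String[] args) {
        String[] all = {
            RAW, QUOTED, REGEX, TWO_LINES, BACKSLASH_N, NOT_AN_ESCAPE, SPELLED_BACKTICK
        };
        for (String s : all) {
            System.out.println(s.length());
        }
    }
}
```

The printed lengths match the character counts given in the example comments.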
When this specification says that a raw string literal contains a particular character or sequence of characters, or that a particular character or sequence of characters is in a raw string literal, it means that the string represented by the raw string literal (as opposed to the body of the raw string literal) contains the character or sequence of characters.
A raw string literal may contain a backtick character in any position except the beginning or the end.
The lexical grammar implies that the string represented by a raw string literal is non-empty, and does not begin or end with a backtick character. Denoting an empty raw string literal:
String s = ``; // Illegal
is not possible because the backticks are interpreted as the opening delimiter of a raw string literal that does not finish before the end of the compilation unit. Beginning the string with a backtick:
String s = `` is the backtick character`;
is not possible for the same reason. Ending the string with a backtick is not possible because the delimiters are unbalanced:
String s = `Don't forget the backtick character ``;
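If a string must begin or end with a backtick, then the backtick must be kept apart from the delimiters of the raw string literal, either by prepending or appending a padding character or, as in the following example, by concatenating the backtick from an ordinary string literal: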
String s = "`" + ` is the backtick character, and so is ` + "`";
The number of backtick characters in the opening delimiter (and thus, in the closing delimiter) must be chosen with regard for the presence of backtick characters in the raw string literal. If a raw string literal contains a sequence of one or more backtick characters preceded and followed by non-backtick characters, then the length of the sequence must be different from the number of backticks in the opening delimiter, or a compile-time error occurs.
Examples of raw string literals that contain backticks:
`Hi, ``Bob`` and ```Jim```.`
``Hi, `Bob` and ```Jim```.``
```Hi, `Bob` and ``Jim``.```
`` ` `` // the three characters SP ` SP
``` `` ``` // the four characters SP ` ` SP
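A tool that generates raw string literals could pick a delimiter by collecting the lengths of all backtick runs in the content and choosing the smallest length that does not occur. The following is a sketch of that idea, not part of this specification; RawStringDelimiters and its methods are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; it is not part of the JDK.
final class RawStringDelimiters {

    // Returns the smallest number of backticks that differs from the length
    // of every backtick run in 'content'.
    static int smallestDelimiterLength(String content) {
        if (content.isEmpty()
                || content.charAt(0) == '`'
                || content.charAt(content.length() - 1) == '`') {
            throw new IllegalArgumentException(
                    "content must be non-empty and must not begin or end with a backtick");
        }
        Set<Integer> runLengths = new HashSet<>();
        int i = 0;
        while (i < content.length()) {
            if (content.charAt(i) != '`') {
                i++;
                continue;
            }
            int runStart = i;
            while (i < content.length() && content.charAt(i) == '`') {
                i++;
            }
            runLengths.add(i - runStart);
        }
        int length = 1;
        while (runLengths.contains(length)) {
            length++;
        }
        return length;
    }

    // Wraps 'content' in balanced delimiters of the chosen length.
    static String quote(String content) {
        int length = smallestDelimiterLength(content);
        StringBuilder delimiter = new StringBuilder();
        for (int k = 0; k < length; k++) {
            delimiter.append('`');
        }
        return delimiter + content + delimiter;
    }

    public static void main(String[] args) {
        System.out.println(quote("Hi, Bob"));
        System.out.println(quote("Hi, `Bob`"));
    }
}
```

Choosing the smallest unused length is only one possible policy. Any length that differs from every run in the content satisfies the rule.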
A raw string literal is always of type String (4.3.3).
At run time, a raw string literal is evaluated to a reference to an instance of type String that corresponds to the string represented by the raw string literal. Raw string literals are interned in the same manner as string literals.
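The interning behavior that raw string literals are specified to share can be observed today with ordinary string literals. The sketch below uses only ordinary literals, so that it does not depend on the proposed syntax; the class and method names are hypothetical.

```java
// Hypothetical demonstration class; it uses ordinary string literals to show
// the interning behavior that this proposal extends to raw string literals.
final class InterningDemo {

    // Reference comparison, deliberately not equals().
    static boolean isSameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "cde";
        String computed = new String(new char[] {'c', 'd', 'e'});

        // Two occurrences of the same literal refer to one String instance.
        System.out.println(isSameObject(literal, "cde"));
        // A string built at run time is a distinct instance...
        System.out.println(isSameObject(literal, computed));
        // ...until it is explicitly interned.
        System.out.println(isSameObject(literal, computed.intern()));
    }
}
```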
Raw string literals can be used wherever an instance of String is allowed, such as in the string concatenation operator (15.18.1) and when calling methods of String:
System.out.println("abc" + `cde`);
`1+1 is ` + String.valueOf(2)
String cde = `abcde`.substring(2);
ORACLE AMERICA, INC. IS WILLING TO LICENSE THIS SPECIFICATION TO YOU ONLY UPON THE CONDITION THAT YOU ACCEPT ALL OF THE TERMS CONTAINED IN THIS LICENSE AGREEMENT ("AGREEMENT"). PLEASE READ THE TERMS AND CONDITIONS OF THIS AGREEMENT CAREFULLY.
Specification: JSR-384 Java SE 11 (18.9) ("Specification")
Version: 11
Status: Early Draft Review
Release: February 2018
Copyright © 1997, 2018, Oracle America, Inc.
500 Oracle Parkway, Redwood City, California 94065, U.S.A.
All rights reserved.
The Specification is protected by copyright and the information described therein may be protected by one or more U.S. patents, foreign patents, or pending applications. Except as provided under the following license, no part of the Specification may be reproduced in any form by any means without the prior written authorization of Oracle America, Inc. ("Oracle") and its licensors, if any. Any use of the Specification and the information described therein will be governed by the terms and conditions of this Agreement.
Subject to the terms and conditions of this license, including your compliance with Paragraphs 1 and 2 below, Oracle hereby grants you a fully-paid, non-exclusive, non-transferable, limited license (without the right to sublicense) under Oracle's intellectual property rights to:
Review the Specification for the purposes of evaluation. This includes: (i) developing implementations of the Specification for your internal, non-commercial use; (ii) discussing the Specification with any third party; and (iii) excerpting brief portions of the Specification in oral or written communications which discuss the Specification provided that such excerpts do not in the aggregate constitute a significant portion of the Technology.
Distribute implementations of the Specification to third parties for their testing and evaluation use, provided that any such implementation:
does not modify, subset, superset or otherwise extend the Licensor Name Space, or include any public or protected packages, classes, Java interfaces, fields or methods within the Licensor Name Space other than those required/authorized by the Specification or Specifications being implemented;
is clearly and prominently marked with the word "UNTESTED" or "EARLY ACCESS" or "INCOMPATIBLE" or "UNSTABLE" or "BETA" in any list of available builds and in proximity to every link initiating its download, where the list or link is under Licensee's control; and
includes the following notice: "This is an implementation of an early-draft specification developed under the Java Community Process (JCP) and is made available for testing and evaluation purposes only. The code is not compatible with any specification of the JCP."
The grant set forth above concerning your distribution of implementations of the specification is contingent upon your agreement to terminate development and distribution of your "early draft" implementation as soon as feasible following final completion of the specification. If you fail to do so, the foregoing grant shall be considered null and void.
No provision of this Agreement shall be understood to restrict your ability to make and distribute to third parties applications written to the Specification.
Other than this limited license, you acquire no right, title or interest in or to the Specification or any other Oracle intellectual property, and the Specification may only be used in accordance with the license terms set forth herein. This license will expire on the earlier of: (a) two (2) years from the date of Release listed above; (b) the date on which the final version of the Specification is publicly released; or (c) the date on which the Java Specification Request (JSR) to which the Specification corresponds is withdrawn. In addition, this license will terminate immediately without notice from Oracle if you fail to comply with any provision of this license. Upon termination, you must cease use of or destroy the Specification.
"Licensor Name Space" means the public class or interface declarations whose names begin with "java", "javax", "com.oracle" or their equivalents in any subsequent naming convention adopted by Oracle through the Java Community Process, or any recognized successors or replacements thereof.
No right, title, or interest in or to any trademarks, service marks, or trade names of Oracle or Oracle's licensors is granted hereunder. Oracle, the Oracle logo, and Java are trademarks or registered trademarks of Oracle America, Inc. in the U.S. and other countries.
THE SPECIFICATION IS PROVIDED "AS IS" AND IS EXPERIMENTAL AND MAY CONTAIN DEFECTS OR DEFICIENCIES WHICH CANNOT OR WILL NOT BE CORRECTED BY ORACLE. ORACLE MAKES NO REPRESENTATIONS OR WARRANTIES, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO, WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT THAT THE CONTENTS OF THE SPECIFICATION ARE SUITABLE FOR ANY PURPOSE OR THAT ANY PRACTICE OR IMPLEMENTATION OF SUCH CONTENTS WILL NOT INFRINGE ANY THIRD PARTY PATENTS, COPYRIGHTS, TRADE SECRETS OR OTHER RIGHTS. This document does not represent any commitment to release or implement any portion of the Specification in any product.
THE SPECIFICATION COULD INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS. CHANGES ARE PERIODICALLY ADDED TO THE INFORMATION THEREIN; THESE CHANGES WILL BE INCORPORATED INTO NEW VERSIONS OF THE SPECIFICATION, IF ANY. ORACLE MAY MAKE IMPROVEMENTS AND/OR CHANGES TO THE PRODUCT(S) AND/OR THE PROGRAM(S) DESCRIBED IN THE SPECIFICATION AT ANY TIME. Any use of such changes in the Specification will be governed by the then-current license for the applicable version of the Specification.
TO THE EXTENT NOT PROHIBITED BY LAW, IN NO EVENT WILL ORACLE OR ITS LICENSORS BE LIABLE FOR ANY DAMAGES, INCLUDING WITHOUT LIMITATION, LOST REVENUE, PROFITS OR DATA, OR FOR SPECIAL, INDIRECT, CONSEQUENTIAL, INCIDENTAL OR PUNITIVE DAMAGES, HOWEVER CAUSED AND REGARDLESS OF THE THEORY OF LIABILITY, ARISING OUT OF OR RELATED TO ANY FURNISHING, PRACTICING, MODIFYING OR ANY USE OF THE SPECIFICATION, EVEN IF ORACLE AND/OR ITS LICENSORS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
You will hold Oracle (and its licensors) harmless from any claims based on your use of the Specification for any purposes other than the limited right of evaluation as described above, and from any claims that later versions or releases of any Specification furnished to you are incompatible with the Specification provided to you under this license.
If this Software is being acquired by or on behalf of the U.S. Government or by a U.S. Government prime contractor or subcontractor (at any tier), then the Government's rights in the Software and accompanying documentation shall be only as set forth in this license; this is in accordance with 48 C.F.R. 227.7201 through 227.7202-4 (for Department of Defense (DoD) acquisitions) and with 48 C.F.R. 2.101 and 12.212 (for non-DoD acquisitions).
You may wish to report any ambiguities, inconsistencies or inaccuracies you may find in connection with your evaluation of the Specification ("Feedback"). To the extent that you provide Oracle with any Feedback, you hereby: (i) agree that such Feedback is provided on a non-proprietary and nonconfidential basis, and (ii) grant Oracle a perpetual, non-exclusive, worldwide, fully paid-up, irrevocable license, with the right to sublicense through multiple levels of sublicensees, to incorporate, disclose, and use without limitation the Feedback for any purpose related to the Specification and future versions, implementations, and test suites thereof.
Any action related to this Agreement will be governed by California law and controlling U.S. federal law. The U.N. Convention for the International Sale of Goods and the choice of law rules of any jurisdiction will not apply. The Specification is subject to U.S. export control laws and may be subject to export or import regulations in other countries. Licensee agrees to comply strictly with all such laws and regulations and acknowledges that it has the responsibility to obtain such licenses to export, re-export or import as may be required after delivery to Licensee.
This Agreement is the parties' entire agreement relating to its subject matter. It supersedes all prior or contemporaneous oral or written communications, proposals, conditions, representations and warranties and prevails over any conflicting or additional terms of any quote, order, acknowledgment, or other communication between the parties relating to its subject matter during the term of this Agreement. No modification to this Agreement will be binding, unless in writing and signed by an authorized representative of each party.