FULL PRODUCT VERSION :
java version "1.6.0_21"
Java(TM) SE Runtime Environment (build 1.6.0_21-b06)
Java HotSpot(TM) 64-Bit Server VM (build 17.0-b16, mixed mode)
ADDITIONAL OS VERSION INFORMATION :
Linux server 2.6.35-gentoo-r5 #1 SMP PREEMPT Thu Sep 2 10:01:00 CEST 2010 x86_64 AMD Phenom(tm) II X4 945 Processor AuthenticAMD GNU/Linux
A DESCRIPTION OF THE PROBLEM :
There is a discrepancy between documentation and implementation for the handling of quotes in MessageFormat patterns. The spec states that "a QuotedString can contain arbitrary characters except single quotes", but the implementation allows pairs of single quotes inside a quoted string to denote a single quote.
Relevant parts of the documentation:
String: StringPart(opt) | String StringPart
StringPart: '' | ' QuotedString ' | UnquotedString
Within a String, "''" represents a single quote. A QuotedString can contain arbitrary characters except single quotes; the surrounding single quotes are removed. An UnquotedString can contain arbitrary characters except single quotes and left curly brackets.
Consider the string "'foo '''' bar'". According to the above specification, single quotes cannot be part of a QuotedString. Therefore the String consists of three parts: one quoted string "foo ", one quoted string " bar", and a pair of single quotes in between. The output should therefore contain exactly one single quote.
The implementation, however, seems to treat the whole string as one QuotedString enclosed in single quotes. Each pair of single quotes inside it is treated as one escaped single quote, even though the pair sits inside the quoted string.
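The two readings can be compared side by side. The sketch below is only an illustration: specReading is a hypothetical helper (not part of the JDK) that applies my literal interpretation of the grammar quoted above to patterns without format elements, and the second call shows what java.text.MessageFormat itself returns.

```java
import java.text.MessageFormat;

class QuoteReadings {
    // Hypothetical helper, NOT a JDK API: a literal reading of the documented
    // grammar, restricted to patterns that contain no format elements.
    static String specReading(String pattern) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < pattern.length()) {
            char ch = pattern.charAt(i);
            if (ch == '\'') {
                if (pattern.startsWith("''", i)) {
                    // StringPart "''": represents one single quote
                    out.append('\'');
                    i += 2;
                } else {
                    // StringPart "' QuotedString '": runs to the next single quote
                    int close = pattern.indexOf('\'', i + 1);
                    if (close < 0) {
                        throw new IllegalArgumentException("unterminated quote");
                    }
                    out.append(pattern, i + 1, close);
                    i = close + 1;
                }
            } else if (ch == '{') {
                throw new IllegalArgumentException("format elements are not handled in this sketch");
            } else {
                // UnquotedString character
                out.append(ch);
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String pattern = "'foo '''' bar'";
        System.out.println(specReading(pattern));              // foo ' bar
        System.out.println(MessageFormat.format(pattern, 1));  // foo '' bar
    }
}
```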
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
java.text.MessageFormat.format("'foo '''' bar'", 1)
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
"foo ' bar"
ACTUAL -
"foo '' bar"
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
public class MessageFormatQuoteDiscrepancy {
    public static void main(String[] args) {
        String res = java.text.MessageFormat.format("'foo '''' bar'", 1);
        if (!"foo ' bar".equals(res))
            throw new AssertionError(res);
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
A workaround relying on the current implementation would be:
1. avoid consecutive StringParts with QuotedStrings; use a single QuotedString instead, and
2. simply double every occurrence of a single quote, whether it is inside a QuotedString or not
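A minimal sketch of this implementation-reliant workaround follows. The helper name escapeLiteral is hypothetical, and the choice to add the surrounding quotes only when the text contains a left curly bracket is my own assumption: a text consisting solely of single quotes would otherwise gain an extra quote, because the wrapping quotes would pair up with the doubled ones.

```java
import java.text.MessageFormat;

class LiteralEscaper {
    // Hypothetical helper, NOT a JDK API. Relies on the observed behavior of
    // the implementation, not on the documented grammar.
    static String escapeLiteral(String text) {
        // Step 2: double every single quote, wherever it occurs.
        String doubled = text.replace("'", "''");
        if (text.indexOf('{') < 0) {
            // No left curly bracket, so no quoting is needed at all.
            return doubled;
        }
        // Step 1: wrap the whole text in one QuotedString.
        return "'" + doubled + "'";
    }

    public static void main(String[] args) {
        String pattern = escapeLiteral("{'{");
        System.out.println(pattern);                           // '{''{'
        System.out.println(MessageFormat.format(pattern, 1));  // {'{
    }
}
```

The helper only covers literal text; a pattern that mixes literals with real format elements such as {0} would have to be assembled from escaped literal pieces and unescaped elements.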
A workaround consistent with both spec and implementation would need to:
1. avoid consecutive StringParts with single quotes
2. ensure that all single quotes are doubled and outside any QuotedString
It is not always possible to be consistent with both spec and implementation. For example, a literal sequence "{'{" in the output cannot be achieved reliably. Both { characters have to be quoted, but then the two conditions stated above cannot be fulfilled simultaneously. According to the documentation, the only possible representation would be "'{''''{'", whereas the implementation produces the desired result only for "'{''{'".
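The two candidate patterns for "{'{" can be run through the implementation directly. This is a small sketch; the class and method names are made up for illustration, and the printed results reflect the quote handling described in this report.

```java
import java.text.MessageFormat;

class BraceQuoteBrace {
    // Thin wrapper so both patterns go through the same call;
    // the argument 1 is unused because neither pattern has format elements.
    static String viaImplementation(String pattern) {
        return MessageFormat.format(pattern, 1);
    }

    public static void main(String[] args) {
        // Pattern the documentation would require for the literal {'{
        System.out.println(viaImplementation("'{''''{'"));  // {''{
        // Pattern the implementation needs for the literal {'{
        System.out.println(viaImplementation("'{''{'"));    // {'{
    }
}
```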
I'm not sure which carries more weight, the documentation or the implementation. I consider the documentation slightly better, but adjusting the implementation might break existing applications.
- relates to
  JDK-4293229 RFE: Need better handling/documentation of single quotes in MessageFormat (Resolved)
  JDK-7003643 [Fmt-Me] MessageFormat.toPattern produces wrong quoted string and subformat modifiers (Closed)