Enhancement
Resolution: Duplicate
P4
None
7
generic
generic
A DESCRIPTION OF THE REQUEST :
Properties files use ISO-8859-1 encoding, with \u escapes for any Unicode characters outside that range. There doesn't appear to be a standard charset that supports this, which makes it difficult to generate properties files directly using PrintWriter, for example. However, the native2ascii tool presumably contains code that does this conversion.
JUSTIFICATION :
It's a bit odd that there isn't an obvious way to generate a valid properties file that contains non-Basic Latin characters.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Additional entry in java.nio.charset.StandardCharsets for an ISO-8859-1 variant with \u escapes.
ACTUAL -
No standard charset available.
---------- BEGIN SOURCE ----------
package charsets;

import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class OutputText
{
    public static void main(String[] args)
        throws FileNotFoundException, UnsupportedEncodingException
    {
        // U+2019 has no ISO-8859-1 mapping, so the writer substitutes '?'.
        try (PrintWriter writer =
                 new PrintWriter(args[0], StandardCharsets.ISO_8859_1.name()))
        {
            writer.print("\u2019");
        }
    }
}
Run this with a filename such as /tmp/TestOutput.txt as the argument. Inspect the file and you'll see a single ? character rather than U+2019 (right single quotation mark) or a \u2019 escape. The same happens if the charset is US_ASCII.
---------- END SOURCE ----------
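The substitution seen above can be checked without writing a file. The following is a minimal sketch, not part of the original report: the class name UnmappableCheck and the helper fitsLatin1 are hypothetical, while CharsetEncoder.canEncode and String.getBytes(Charset) are standard JDK calls. It assumes the same behavior as the PrintWriter case, namely that an unmappable character is replaced by the charset's replacement byte.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;

class UnmappableCheck {

    // Hypothetical helper: true if every character of the text has an
    // ISO-8859-1 mapping, i.e. no '?' substitution would occur.
    static boolean fitsLatin1(String text) {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        String quote = "\u2019";

        // getBytes(Charset) silently replaces unmappable characters.
        byte[] bytes = quote.getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(fitsLatin1(quote));                // false
        System.out.println(bytes.length);                     // 1
        System.out.println(Integer.toHexString(bytes[0]));    // 3f, the '?' byte
    }
}
```

A check like this could be used to decide which characters need a \u escape before the text reaches the writer.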
CUSTOMER SUBMITTED WORKAROUND :
You need to provide your own routine for converting the affected characters to \u escapes. I did this by processing each character individually, although I ignored supplementary planes.
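A routine of the kind described might look like the sketch below. It is an illustration rather than the submitter's actual code: the class PropertiesEscaper and its escape method are hypothetical names. It assumes that escaping everything outside printable ASCII is acceptable (which also makes the output safe for US_ASCII), and it does not handle characters that are special to the properties syntax itself, such as backslash, '=', ':' or '#'. Because it works on UTF-16 code units, a supplementary-plane character comes out as two escapes, one per surrogate.

```java
class PropertiesEscaper {

    // Hypothetical helper, not a JDK API: copies printable ASCII unchanged and
    // writes every other UTF-16 code unit as a backslash-u escape with four
    // hex digits.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 0x20 && c <= 0x7E) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04X", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Right single quotation mark, U+2019.
        System.out.println(escape("It\u2019s"));
        // A supplementary-plane character given as its surrogate pair.
        System.out.println(escape("\uD83D\uDE00"));
    }
}
```

The escaped string contains only ASCII, so it can be passed to a PrintWriter opened with ISO_8859_1 or US_ASCII without any character being replaced by ?.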
- duplicates
  JDK-8043553 JEP 226: UTF-8 Property Resource Bundles (Closed)
- relates to
  JDK-4919638 Unicode property files for PropertyResourceBundles - again (Closed)