- Bug
- Resolution: Incomplete
- P4
- None
- 8u162
- x86_64
- windows_10
A DESCRIPTION OF THE PROBLEM :
Java strings are sequences of UTF-16 code units. In most cases a single symbol fits in one char (16 bits). A Unicode code point above U+FFFF, however, needs 32 bits in UTF-16 and is stored in the string as a surrogate pair, that is, two consecutive chars.
When writing a document with its write method, HTMLEditorKit treats the two chars of a surrogate pair as two separate characters instead of combining them into one code point.
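The char-versus-code-point distinction described above can be checked with plain `String` methods. The following is a minimal sketch, not part of the original report; the class name `SurrogatePairDemo` and its helper methods are hypothetical, and U+20C78 is used only as an example of a code point above U+FFFF.

```java
public class SurrogatePairDemo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // Build a one-symbol string from a supplementary code point (assumed example value).
        String s = new String(Character.toChars(0x20C78));

        System.out.println(codeUnits(s));   // prints 2
        System.out.println(codePoints(s));  // prints 1

        // The two chars are the high and low halves of a surrogate pair.
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // prints true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));  // prints true

        // codePointAt recombines the pair into the original code point.
        System.out.println(Integer.toHexString(s.codePointAt(0)));  // prints 20c78
    }
}
```

Code that walks a string char by char sees two units here, while code that walks it by code point sees a single symbol.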
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
1. Create an HTMLDocument
2. Insert this string into its body "ð ÂÂð Â±ð Â¹ð ±Â𠱸ð ²Âð ³Â"
3. Save it using the HTMLEditorKit write method
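The steps above might be reproduced with something like the following sketch. It is an assumption about how the reporter set things up, not their actual test case: the class name `SurrogateWriteRepro` and the helper `writeHtml` are hypothetical, the document is created through `HTMLEditorKit.createDefaultDocument()`, and a single supplementary character (U+20C78) stands in for the string from step 2. The output is printed for inspection only, since the exact form depends on the JDK version.

```java
import java.io.StringWriter;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

public class SurrogateWriteRepro {
    // Inserts the text into a fresh HTMLDocument and returns what the kit writes out.
    static String writeHtml(String text) throws Exception {
        HTMLEditorKit kit = new HTMLEditorKit();
        HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();

        // Step 2: insert the text into the document body.
        doc.insertString(0, text, null);

        // Step 3: save the document through the editor kit.
        StringWriter out = new StringWriter();
        kit.write(out, doc, 0, doc.getLength());
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // One supplementary character, i.e. two Java chars (assumed example value).
        String text = new String(Character.toChars(0x20C78));

        // Check whether the symbol appears once or as two separate entries.
        System.out.println(writeHtml(text));
    }
}
```

Comparing the printed markup against `text.codePointCount(0, text.length())` shows whether the writer emitted one entry per code point or one per char.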
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
Html contains "ð ÂÂð Â±ð Â¹ð ±Â𠱸ð ²Âð ³Â"
ACTUAL -
The HTML contains 14 dummy symbols, one for each UTF-16 char of the seven inserted characters