Type: CSR
Resolution: Approved
Priority: P3
Fix Version/s: 13
Component/s: core-libs
Labels: None

Subcomponent: java.nio.charsets
Compatibility Kind: behavioral
Compatibility Risk: low
Compatibility Risk Description: Client code that *expects* U+FFFE to be reported as "malformed" will no longer work after this change. Reporting it as malformed is no longer recommended by the Unicode Consortium corrigendum.
Interface Kind: Java API
Scope: SE

Summary

Correct how UnicodeDecoder subclasses handle the U+FFFE code point when it appears in the middle of the input buffer.

Problem

Currently, UnicodeDecoder treats U+FFFE in the middle of a string as "malformed" because it is a non-character. This was correct prior to Unicode 7. However, Unicode 7 includes Corrigendum 9 (http://www.unicode.org/versions/corrigendum9.html), which changed the definition of non-characters. UnicodeDecoder's behavior should be modified to conform to it.

Solution

Remove the code in UnicodeDecoder that detects the code point in the middle of the input and returns a "malformed" CoderResult, so that the UTF-16 decoders (StandardCharsets.UTF_16[LE/BE]) pass the code point through.
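
The effect can be observed with a strict decoder. The following is only an illustrative sketch: the class name `FffeDecodeSketch` and the helper `decodeStrict` are hypothetical, and it assumes a JDK that includes this fix. On a JDK without the fix, the same call would be expected to fail with a malformed-input error instead.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the JDK change itself.
class FffeDecodeSketch {

    // Decodes with REPORT actions so that malformed input surfaces as an
    // exception instead of being silently replaced with U+FFFD.
    static String decodeStrict(byte[] bytes, Charset cs) throws CharacterCodingException {
        return cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) {
        // 'A', U+FFFE, 'B' in big-endian UTF-16; U+FFFE sits in the middle.
        byte[] bytes = {0x00, 0x41, (byte) 0xFF, (byte) 0xFE, 0x00, 0x42};
        try {
            String s = decodeStrict(bytes, StandardCharsets.UTF_16BE);
            System.out.printf("decoded %d chars, middle char is U+%04X%n",
                    s.length(), (int) s.charAt(1));
        } catch (CharacterCodingException e) {
            // Assumed outcome on a JDK that still treats U+FFFE as malformed.
            System.out.println("reported as malformed: " + e);
        }
    }
}
```

Using `CodingErrorAction.REPORT` matters here: with the default `REPLACE` action used by convenience APIs such as the `String` constructors, a malformed report would show up only as a U+FFFD replacement character rather than an exception.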

Specification

As required by Unicode 7 Corrigendum 9, U+FFFE in the middle of the input is passed through as a code point instead of being reported as malformed.
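
Client code that relied on the earlier "malformed" report to reject U+FFFE can perform the check itself on the decoded text. The following is a minimal sketch under that assumption; `FffeGuard` and `indexOfFffe` are hypothetical names, not JDK APIs.

```java
// Hypothetical helper; not part of the JDK.
class FffeGuard {

    // Returns the index of the first U+FFFE in the text, or -1 if absent.
    static int indexOfFffe(CharSequence text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\uFFFE') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfFffe("A\uFFFEB")); // prints 1
        System.out.println(indexOfFffe("AB"));       // prints -1
    }
}
```

A caller that wants the old rejection semantics could throw its own exception when the returned index is non-negative.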

csr of

JDK-8216140 Correct UnicodeDecoder U+FFFE handling

Closed

Assignee: Naoto Sato

Reporter: Naoto Sato

Reviewed By: Roger Riggs

Votes: 0

Watchers: 2

Created: 2019-01-15 08:22

Updated: 2019-01-15 12:04

Resolved: 2019-01-15 11:52
