JDK-8273259

Character.getName doesn't follow Unicode spec for ideographs


Details

    • b15
    • generic
    • generic

    Description

      A DESCRIPTION OF THE PROBLEM :
      Chapter 4 of the Unicode Standard
      (https://www.unicode.org/versions/Unicode13.0.0/ch04.pdf) defines a naming rule, NR2, on page 182 that systematically derives names for the Unicode code points in a fixed set of ranges.

      Character.getName does not follow this naming rule. Instead, it treats the characters in most of these ranges as if they had no name and appears to fall back to its block-based derivation.
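
For illustration, the NR2 derivation can be sketched as a prefix followed by the code point in uppercase hexadecimal, padded to at least four digits. This is a minimal sketch, not the JDK implementation: the class and method names are hypothetical, only a subset of the CJK unified ideograph ranges is listed, and the range end points are assumptions that should be checked against the table in the chapter linked above.

```java
public class Nr2NameSketch {
    // Illustrative subset of NR2 ranges as {first, last} pairs.
    // ASSUMPTION: end points reflect Unicode 13.0; verify against the spec.
    private static final int[][] CJK_UNIFIED_RANGES = {
        {0x3400, 0x4DBF},   // Extension A
        {0x4E00, 0x9FFC},   // CJK Unified Ideographs
        {0x20000, 0x2A6DD}, // Extension B
    };

    private static final String PREFIX = "CJK UNIFIED IDEOGRAPH-";

    /**
     * Returns the NR2-derived name for the given code point, or null if the
     * code point is outside the ranges listed in this sketch.
     */
    public static String derivedName(int codePoint) {
        for (int[] range : CJK_UNIFIED_RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                // Prefix plus the code point in uppercase hex, at least 4 digits.
                return PREFIX + String.format("%04X", codePoint);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(derivedName(0x2000A));
    }
}
```

The other NR2 ranges (for example the compatibility ideographs) use different prefixes and would need a prefix per range rather than the single constant used here.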

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Character.getName(0x2000A)

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Returns "CJK UNIFIED IDEOGRAPH-2000A"
      ACTUAL -
      Returns "CJK UNIFIED IDEOGRAPHS EXTENSION B 2000A"
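
The shape of the ACTUAL string matches the fallback that the Character.getName Javadoc describes for code points with no assigned name: the Unicode block name with underscores replaced by spaces, then a space and the code point in hexadecimal. The sketch below rebuilds that expression for comparison; the class and method names are hypothetical, and it is not a claim about how the JDK computes the value internally.

```java
import java.util.Locale;

public class BlockFallbackSketch {
    /**
     * Rebuilds the block-based name documented for Character.getName when a
     * code point has no assigned name. Assumes the code point belongs to a
     * defined Unicode block (UnicodeBlock.of returns null otherwise).
     */
    public static String blockBasedName(int codePoint) {
        return Character.UnicodeBlock.of(codePoint).toString().replace('_', ' ')
                + " "
                + Integer.toHexString(codePoint).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        int codePoint = 0x2000A;
        // Print both values side by side; which name getName returns
        // depends on the JDK build in use.
        System.out.println("block-based: " + blockBasedName(codePoint));
        System.out.println("getName:     " + Character.getName(codePoint));
    }
}
```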

      ---------- BEGIN SOURCE ----------
      Character.getName(0x2000A)
      ---------- END SOURCE ----------

      FREQUENCY : always


            People

              naoto Naoto Sato
              webbuggrp Webbug Group