JDK-8273259

Character.getName doesn't follow Unicode spec for ideographs


    • b15
    • generic
    • generic

      A DESCRIPTION OF THE PROBLEM :
      Chapter 4 of the Unicode Standard
      (https://www.unicode.org/versions/Unicode13.0.0/ch04.pdf) defines a naming rule, NR2, on page 182 that systematically derives names for Unicode code points in a set of ranges.

      Character.getName does not follow this naming scheme. Instead, most of these ranges are treated as though the characters have no name, and the block-based derivation rule seems to be used.
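
For illustration, the NR2 derivation can be sketched as the range's prefix followed by the code point in uppercase hexadecimal. This is a minimal sketch, not the JDK's implementation: the class name, the method name, and the abbreviated range table are hypothetical, and the range endpoints are assumptions that should be checked against the NR2 table in the Unicode 13.0 chapter cited above.

```java
import java.util.Locale;

public class Nr2NameSketch {
    // Assumption: abbreviated, illustrative subset of the NR2 ranges that use
    // the "CJK UNIFIED IDEOGRAPH-" prefix. Verify endpoints against the spec.
    private static final int[][] CJK_UNIFIED_RANGES = {
        {0x3400, 0x4DBF},    // assumed: CJK Unified Ideographs Extension A
        {0x20000, 0x2A6DD},  // assumed: CJK Unified Ideographs Extension B
    };

    private static final String PREFIX = "CJK UNIFIED IDEOGRAPH-";

    // Returns the NR2-style derived name, or null if the code point is
    // outside the ranges this sketch knows about.
    static String nr2Name(int codePoint) {
        for (int[] range : CJK_UNIFIED_RANGES) {
            if (codePoint >= range[0] && codePoint <= range[1]) {
                // Prefix plus the code point in uppercase hex, at least 4 digits.
                return PREFIX + String.format(Locale.ROOT, "%04X", codePoint);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(nr2Name(0x2000A));
    }
}
```

For 0x2000A the sketch produces the name with the hyphenated prefix and no block name, which is the shape this report expects from Character.getName.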

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Character.getName(0x2000A)

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      Returns "CJK UNIFIED IDEOGRAPH-2000A"
      ACTUAL -
      Returns "CJK UNIFIED IDEOGRAPHS EXTENSION B 2000A"
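
The actual value has the same shape as the fallback expression given in the Character.getName Javadoc for code points that have no name in the UnicodeData file: the block name with underscores replaced by spaces, followed by the uppercase hex code point. The sketch below evaluates that expression directly; the class and method names are hypothetical.

```java
import java.util.Locale;

public class BlockFallbackSketch {
    // Block-based fallback name, following the expression documented in the
    // Character.getName Javadoc for code points without a UnicodeData name.
    static String blockBasedName(int codePoint) {
        return Character.UnicodeBlock.of(codePoint).toString().replace('_', ' ')
                + " " + Integer.toHexString(codePoint).toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(blockBasedName(0x2000A));
    }
}
```

Comparing this string with the result of Character.getName(0x2000A) on a given build shows whether that build still takes the block-based path for this code point.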

      ---------- BEGIN SOURCE ----------
      System.out.println(Character.getName(0x2000A));
      ---------- END SOURCE ----------

      FREQUENCY : always


            naoto Naoto Sato
            webbuggrp Webbug Group