JDK-8145136

Upgrade CLDR locale data



    • b122
    • generic
    • generic


      Upgrade to version 29 of the Unicode Consortium’s Common Locale Data Repository (CLDR), the latest release at the time of writing: http://cldr.unicode.org/index/downloads/cldr-29
      Unicode CLDR 29 updates the key building blocks for software that supports the world's languages. All major software systems use this data for internationalization and localization, adapting software to the conventions of different languages for common software tasks.

      Highlighted below are some key features of CLDR versions 28 and 29:

      CLDR Version 28 changes:

      • General locale data. Overall, about 5% of the data items in this release are new, while about 8% have corrections. Notable changes include a major review and improvement of the Spanish locales for Latin America; the addition of two new “modern-coverage” locales (Belarusian and Irish); and the move of certain data from en_GB to en_001 for improved quality and reduced data size in locales that use en_GB conventions.

      • Formatting. There are a number of new units and format types. The day period rules, which many languages prefer to AM/PM (“10:30 at night”), were substantially revised and localized. Compact formatting for currencies was added (“€10M”, “€10 million”), along with more unit measures: 7 new general units (for example, duration-century), 21 new per-unit types, 4 new units for measuring personal age (needed for some languages), and new coordinate units for formatting latitude and longitude across languages (“10°N”).

      CLDR Version 29 changes:

      • BCP47 extensions. New keys for specifying transliteration and emoji presentation, and for customizing locales with region-specific settings; extra structure for complete validity testing.

      • Transliterators. Major cleanup, including BCP47 IDs for all transforms, simpler rule format, additional transforms and unit tests.

      • Units. New structure for choosing appropriate units based on locale and usage; new units for concentration and imperial gallons.

      • Locales. Added Cantonese, selected fixes to other locales.

      Upgrading to the latest CLDR version (v29) requires the following:

          Replacing the CLDR locale data XML files, moving from v27 to v29.
          Updating the existing CLDRConverter tool to meet the requirements of CLDR version 29. Existing supported features will not be modified.

      The complete set of changes in CLDR version 29 can be obtained from the link below: http://cldr.unicode.org/index/downloads/cldr-29
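
      To make the first step more concrete: CLDR locale data is published as LDML XML files. The sketch below is a minimal illustration that reads one value out of an LDML-style fragment using the JDK's standard XPath API. The fragment is hand-written for illustration (it is not copied from a CLDR release), the class and method names are hypothetical, and this is not how the CLDRConverter tool itself is implemented.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper, for illustration only.
class LdmlFragmentReader {

    // Hand-written LDML-style fragment (assumption: not real CLDR release data).
    static final String SAMPLE =
          "<ldml><dates><calendars><calendar type=\"gregorian\">"
        + "<months><monthContext type=\"format\">"
        + "<monthWidth type=\"abbreviated\">"
        + "<month type=\"12\">Dez.</month>"
        + "</monthWidth></monthContext></months>"
        + "</calendar></calendars></dates></ldml>";

    // Returns the abbreviated format-context name of the given month (1-12),
    // or an empty string if the fragment does not define it.
    static String abbreviatedMonth(String ldml, int month) {
        String expr = "/ldml/dates/calendars/calendar[@type='gregorian']"
            + "/months/monthContext[@type='format']"
            + "/monthWidth[@type='abbreviated']"
            + "/month[@type='" + month + "']";
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            return xpath.evaluate(expr, new InputSource(new StringReader(ldml)));
        } catch (XPathExpressionException e) {
            // Wrap the checked exception so callers stay simple.
            throw new IllegalStateException("Could not evaluate: " + expr, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(abbreviatedMonth(SAMPLE, 12)); // prints "Dez."
    }
}
```

      Swapping in a fragment from a different CLDR version and comparing the returned strings is one simple way to spot data differences before they surface in the JDK.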

      Below are some of the locale changes in JDK 9 and later (with CLDR as the default locale provider):

      • For cs_CZ locale, the Roman numeral month abbreviation is not available (for example, the 06 XII 2010 format cannot be parsed in CLDR)

      • For it_IT locale, the standard currency format is '#,##0.00 ¤'

      • For nb_NO locale, getNegativePrefix() returns the Unicode character 'U+2212' (minus sign)

      • For da_DK locale, the currency symbol for dansk krone is 'kr.'

      • For ar locale, the symbol for Wednesday in short format is الأربعاء in CLDR. Parsing a day in short format in the Arabic locale differs between CLDR and COMPAT

      • For de_DE locale, the three-letter month name for December is 'Dez.'

      • For en locale, the date-time medium format pattern has a comma after the date

      • For nl_NL locale, the short date format pattern is dd-MM-yy

      • Root locale supports only two forms for the day names: a narrow form like "S, M, T" etc. and a wide form like "Sun, Mon, Tue" etc.

      • For fi_FI locale, the minus sign is the Unicode character 'U+2212'

      • For en_AU locale, a dot is appended to the abbreviated month name or day name

      • For es_PR locale, ante meridiem and post meridiem in the time format are represented as a.m and p.m respectively

      • For en_GB locale, the 'AM/PM' symbol is am/pm (lower case)

      • For ar locale, the default numbering system for Arabic locales differs between COMPAT and CLDR

      • For ar locale, localized names for month/day in Arabic locales differ between COMPAT and CLDR.

      • For fr_CH locale, the decimal grouping separator is a non-breaking space character (CLDR Issue: https://unicode-org.atlassian.net/browse/ICU-13006)

      • First day of the week should be defined by region, not by language. For example, for the de_DE locale, getFirstDayOfWeek() on a Calendar instance returns 2, that is, Monday (CLDR Issue: https://unicode-org.atlassian.net/browse/CLDR-16866)

      • For de_DE locale, the abbreviation for the month March is März (CLDR Issue: https://unicode-org.atlassian.net/browse/CLDR-6491)

      • For en_US locale, the medium and short date-time format for 'en' is ‹{1}, {0}› (CLDR Issue: https://unicode-org.atlassian.net/browse/CLDR-4781)

      • For ja_JP locale, short display format for Heisei is 平成 (CLDR Issue: https://unicode-org.atlassian.net/browse/CLDR-477)

      • For en_GB locale, the date-time long format pattern has the string 'at' after the date (CLDR Issue: https://unicode-org.atlassian.net/browse/CLDR-4781)
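
      Several of the differences listed above can be checked directly with the standard java.text and java.util APIs. The following is a minimal sketch, not part of the change itself: the class name LocaleDataProbe and its helper methods are hypothetical, and the printed values depend on the CLDR version bundled with the JDK that runs it, so later releases may not match what this issue reports for CLDR 29. On releases that still ship the COMPAT provider, running the same class with -Djava.locale.providers=COMPAT,CLDR allows a side-by-side comparison.

```java
import java.text.DateFormat;
import java.text.DateFormatSymbols;
import java.text.DecimalFormat;
import java.text.NumberFormat;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

// Hypothetical probe class, for illustration only.
class LocaleDataProbe {

    // Negative prefix of the locale's number format, or "" if the
    // provider does not return a DecimalFormat.
    static String negativePrefix(Locale locale) {
        NumberFormat nf = NumberFormat.getNumberInstance(locale);
        return (nf instanceof DecimalFormat) ? ((DecimalFormat) nf).getNegativePrefix() : "";
    }

    // Pattern of the locale's standard currency format, or "" if unavailable.
    static String currencyPattern(Locale locale) {
        NumberFormat nf = NumberFormat.getCurrencyInstance(locale);
        return (nf instanceof DecimalFormat) ? ((DecimalFormat) nf).toPattern() : "";
    }

    // Pattern of the locale's short date format, or "" if unavailable.
    static String shortDatePattern(Locale locale) {
        DateFormat df = DateFormat.getDateInstance(DateFormat.SHORT, locale);
        return (df instanceof SimpleDateFormat) ? ((SimpleDateFormat) df).toPattern() : "";
    }

    // Abbreviated month name; calendarMonth is a Calendar constant such as Calendar.DECEMBER.
    static String abbreviatedMonth(Locale locale, int calendarMonth) {
        return DateFormatSymbols.getInstance(locale).getShortMonths()[calendarMonth];
    }

    // First day of the week as a Calendar constant (Calendar.SUNDAY == 1, Calendar.MONDAY == 2).
    static int firstDayOfWeek(Locale locale) {
        return Calendar.getInstance(locale).getFirstDayOfWeek();
    }

    // Renders each code point as U+XXXX so invisible or look-alike characters are easy to tell apart.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println("nb_NO negative prefix: " + codePoints(negativePrefix(Locale.forLanguageTag("nb-NO"))));
        System.out.println("fi_FI negative prefix: " + codePoints(negativePrefix(Locale.forLanguageTag("fi-FI"))));
        System.out.println("it_IT currency pattern: " + currencyPattern(Locale.ITALY));
        System.out.println("nl_NL short date pattern: " + shortDatePattern(Locale.forLanguageTag("nl-NL")));
        System.out.println("de_DE December (abbreviated): " + abbreviatedMonth(Locale.GERMANY, Calendar.DECEMBER));
        System.out.println("de_DE first day of week: " + firstDayOfWeek(Locale.GERMANY));
    }
}
```

      Printing code points rather than raw characters makes it easier to distinguish the ASCII hyphen-minus (U+002D) from the minus sign (U+2212) mentioned for nb_NO and fi_FI.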





              rgoel Rachna Goel (Inactive)
              naoto Naoto Sato