JDK-8318828

`Locale.getVariant()` returns a different delimiter than the language tag.


      A DESCRIPTION OF THE PROBLEM :
      `java.util.Locale` implements [RFC 5646: Tags for Identifying Languages](https://www.rfc-editor.org/rfc/rfc5646.html), and allows the creation of a `Locale` from a language tag using `Locale.forLanguageTag(String)`. The `Locale` class also provides a `getVariant()` method to extract the variant from the locale.

      RFC 5646 provides examples of language tags, one of them being `sl-rozaj-biske`, which has a variant of `rozaj-biske`. Thus if I create a `Locale` from this language tag and ask for its variant, I expect to receive `rozaj-biske`. Instead I receive `rozaj_biske`. You can try this using JUnit and Hamcrest like this:

      ```java
      assertThat(Locale.forLanguageTag("sl-rozaj-biske").getVariant(), is("rozaj-biske"));
      ```

      This test fails. `Locale.getVariant()` seems to return some internal, old-style representation of the variant, but the Javadocs for `Locale.getVariant()` make no mention of any special format. They say:

      > Returns the variant code for this locale.

      And the variant code is correctly `rozaj-biske`, not `rozaj_biske`.
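For a reproduction that needs neither JUnit nor Hamcrest, the mismatch can be shown with a plain `main` method. This is only a minimal sketch; the class name `VariantDelimiterDemo` is hypothetical and not part of the JDK.

```java
import java.util.Locale;

public class VariantDelimiterDemo {
    public static void main(String[] args) {
        Locale locale = Locale.forLanguageTag("sl-rozaj-biske");

        // Reported behavior: the variant subtags come back joined by an underscore.
        System.out.println(locale.getVariant());

        // Converting back to a language tag yields the hyphen-separated form again.
        System.out.println(locale.toLanguageTag());
    }
}
```

Comparing the two printed lines shows that the hyphenated form survives a round trip through `toLanguageTag()`, while `getVariant()` exposes the underscore-joined form.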

      Contrast this with the Javadocs for `Locale.toString()`, which do mention the underscore `_`, but only describe it as appearing _between_ components. Nothing indicates that a component itself would contain a dash converted to an underscore (and even if it did, that would not necessarily apply to `Locale.getVariant()`). Note further that `Locale.toString()` even shows `x-java`, separated by a dash, in its output. Thus all indications are that each component of a `Locale` should include a dash as a separator (which is indeed the correct value), not an underscore in its place. The underscore only appears in `Locale.toString()` as a delimiter _between_ components.
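To see how the underscore inside the variant interacts with the component delimiters of `Locale.toString()`, the following sketch prints the legacy string form of the same locale. The class name `ToStringDelimiterDemo` and the helper `legacyForm` are hypothetical names used only for illustration.

```java
import java.util.Locale;

public class ToStringDelimiterDemo {
    /** Returns the Locale.toString() form of a locale parsed from a language tag. */
    public static String legacyForm(String languageTag) {
        return Locale.forLanguageTag(languageTag).toString();
    }

    public static void main(String[] args) {
        // The tag has no country, so the country slot between the
        // language and the variant is empty.
        System.out.println(legacyForm("sl-rozaj-biske")); // prints "sl__rozaj_biske"
    }
}
```

In this output the underscore joining `rozaj` and `biske` looks the same as the underscores that delimit the components, so the string alone does not reveal where the variant's own subtags begin and end.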

      Note also that if Java determines it not to be a bug for `Locale.getVariant()` to return a value that is not in fact the true variant, developers have no `Locale` method for retrieving the _true_ variant from a `Locale`, other than doing brute-force replacements or using `Locale.toLanguageTag()` and manually parsing out the variant.
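The brute-force replacement mentioned above could be wrapped in a small helper. This is a sketch under one stated assumption: the `Locale` was created from a well-formed language tag, so every underscore in its variant stands for a subtag separator. `LocaleVariants` and `bcp47Variant` are hypothetical names, not JDK API.

```java
import java.util.Locale;

public class LocaleVariants {
    /**
     * Hypothetical helper: returns the variant with hyphen-separated subtags.
     * Assumes the locale came from a well-formed language tag; variants set
     * through other means may not map cleanly onto RFC 5646 variant subtags.
     */
    public static String bcp47Variant(Locale locale) {
        return locale.getVariant().replace('_', '-');
    }

    public static void main(String[] args) {
        // Two variant subtags, printed in hyphenated form.
        System.out.println(bcp47Variant(Locale.forLanguageTag("sl-rozaj-biske")));

        // A locale without a variant yields an empty string.
        System.out.println(bcp47Variant(Locale.forLanguageTag("sl")).isEmpty());
    }
}
```

Replacing characters was chosen over parsing the output of `Locale.toLanguageTag()` because it avoids having to locate the variant among the script, region, and extension subtags.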



      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      System.out.println(Locale.forLanguageTag("sl-rozaj-biske").getVariant());

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      rozaj-biske
      ACTUAL -
      rozaj_biske

      ---------- BEGIN SOURCE ----------
      assertThat(Locale.forLanguageTag("sl-rozaj-biske").getVariant(), is("rozaj-biske"));
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Locale.forLanguageTag("sl-rozaj-biske").getVariant().replace('_', '-')

      FREQUENCY : always


            naoto Naoto Sato
            webbuggrp Webbug Group