-
Bug
-
Resolution: Not an Issue
-
P4
-
None
-
8, 11, 17, 21, 22
-
generic
-
generic
A DESCRIPTION OF THE PROBLEM :
`java.util.Locale` implements [RFC 5646: Tags for Identifying Languages](https://www.rfc-editor.org/rfc/rfc5646.html), and allows the creation of a `Locale` from a language tag using `Locale.forLanguageTag(String)`. The `Locale` class also provides a `getVariant()` method to extract the variant from the locale.
RFC 5646 provides examples of language tags, one of them being `sl-rozaj-biske`, which has a variant of `rozaj-biske`. Thus if I create a `Locale` from this language tag and ask for its variant, I expect to receive `rozaj-biske`. But instead I receive `rozaj_biske`. You can try this using JUnit and Hamcrest like this:
```java
assertThat(Locale.forLanguageTag("sl-rozaj-biske").getVariant(), is("rozaj-biske"));
```
This test will fail. `Locale.getVariant()` seems to be returning some internal, old-style representation of the variant. But the Javadocs for `Locale.getVariant()` make no mention of any special format. They say:
> Returns the variant code for this locale.
And the variant code is correctly `rozaj-biske`, not `rozaj_biske`.
Contrast this with the Javadocs for `Locale.toString()`, which do mention the underscore `_`, but only describe it as appearing _between_ components. Nothing indicates that a component itself would contain a dash converted to an underscore (and even if it did, it would not necessarily apply to `Locale.getVariant()`). Note further that `Locale.toString()` even shows `x-java`, separated by a dash, in its output. Thus all indications are that each component of a `Locale` should include a dash as its internal separator (which indeed is the correct value), not an underscore. The underscore only appears in `Locale.toString()` as a delimiter _between_ components.
Note also that if Java determines it not to be a bug for `Locale.getVariant()` to return a value that is not in fact the true variant, developers have no `Locale` method for retrieving the _true_ variant from a `Locale`. The only options are brute-force replacements, or calling `Locale.toLanguageTag()` and manually parsing out the variant.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
System.out.println(Locale.forLanguageTag("sl-rozaj-biske").getVariant());
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
rozaj-biske
ACTUAL -
rozaj_biske
---------- BEGIN SOURCE ----------
assertThat(Locale.forLanguageTag("sl-rozaj-biske").getVariant(), is("rozaj-biske"));
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Locale.forLanguageTag("sl-rozaj-biske").getVariant().replace('_', '-')
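For anyone who wants to try the workaround in isolation, here is a minimal, self-contained sketch. The class name `VariantDemo` and the helper `bcp47Variant` are hypothetical names chosen for illustration and are not part of the JDK. The sketch assumes the `Locale` was created from a well-formed language tag, so that any underscore in the variant stands for a subtag separator; a `Locale` built another way with an arbitrary variant string may not round-trip.

```java
import java.util.Locale;

public class VariantDemo {

    // Hypothetical helper (not a JDK method): rewrites the underscore-separated
    // variant returned by getVariant() into hyphen-separated RFC 5646 subtags.
    // Assumption: every underscore in the variant is a subtag separator.
    static String bcp47Variant(Locale locale) {
        return locale.getVariant().replace('_', '-');
    }

    public static void main(String[] args) {
        Locale locale = Locale.forLanguageTag("sl-rozaj-biske");

        // Value as reported in this issue: rozaj_biske
        System.out.println(locale.getVariant());

        // Value after the replacement workaround: rozaj-biske
        System.out.println(bcp47Variant(locale));
    }
}
```

Wrapping the replacement in a single helper keeps the conversion in one place, which makes it easier to drop if a dedicated accessor ever becomes available.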
FREQUENCY : always