  1. JDK
  2. JDK-8333920

NumberFormat integer only parsing breaks when format has suffix


    • CSR
    • Resolution: Approved
    • P4
    • 24
    • core-libs
    • None
    • behavioral
    • low
    • The risk is deemed low.
      For `parse(String)` invocations, values that previously threw a ParseException now parse successfully.
      For `parse(String, ParsePosition)`, callers could be using the method to find where parsing failed through `ParsePosition.getErrorIndex()`. That method now returns -1 for these inputs, because parsing succeeds. The risk is still deemed low, because clients are unlikely to rely on the previous behavior, which was incorrect in the first place.
    • Java API
    • SE

      Summary

      Correct java.text.NumberFormat so that parsing values with suffixes while in integer only mode no longer incorrectly throws a ParseException.

      Problem

      When isParseIntegerOnly() returns true, java.text.NumberFormat always throws a ParseException if the format expects a suffix.

      For example,

      var failFmt = NumberFormat.getCurrencyInstance(Locale.FRANCE);
      failFmt.setParseIntegerOnly(true);
      failFmt.parse("5,25 €"); // throws ParseException
      // For reference, works with a currency prefix symbol
      var passFmt = NumberFormat.getCurrencyInstance(Locale.US);
      passFmt.setParseIntegerOnly(true);
      passFmt.parse("$5.25"); // returns 5

      Solution

      Correct java.text.NumberFormat so that a parse completes successfully even if the format expects a suffix. This is the expected behavior, and the fix brings the actual behavior in line with it.

      For example,

      var fmt = NumberFormat.getCurrencyInstance(Locale.FRANCE);
      fmt.setParseIntegerOnly(true);
      ParsePosition pp = new ParsePosition(0);
      fmt.parse("5,25 €", pp); // returns 5
      pp.getIndex(); // returns 1
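
Whether the suffix case succeeds depends on the JDK in use. The following is a minimal sketch, not part of the CSR, that reports either the parsed value or the error index. The class name `SuffixParseCheck` and its helper method are hypothetical. The input is produced by the formatter itself, so the space character the locale places before the currency symbol is exactly what the parser expects.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class SuffixParseCheck {
    // Formats the amount as currency for the locale, then parses it back
    // in integer-only mode. Returns null if the parse fails.
    static Number parseIntegerOnly(Locale locale, double amount, ParsePosition pp) {
        NumberFormat fmt = NumberFormat.getCurrencyInstance(locale);
        String text = fmt.format(amount);
        fmt.setParseIntegerOnly(true);
        return fmt.parse(text, pp);
    }

    public static void main(String[] args) {
        ParsePosition pp = new ParsePosition(0);
        Number n = parseIntegerOnly(Locale.FRANCE, 5.25, pp);
        if (n == null) {
            // Behavior described in the Problem section
            System.out.println("parse failed, error index " + pp.getErrorIndex());
        } else {
            // Behavior described in the Solution section
            System.out.println("parsed " + n + ", index " + pp.getIndex());
        }
    }
}
```

On a release without the fix, the failure branch is the one expected to run for the French locale.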

      Specification

      This is mainly an implementation change to align with the existing specification. There are additional minor specification clarifications required.

      In NumberFormat.java, clarify the ParsePosition.index behavior when parsing integers only.

      @@ -468,12 +468,11 @@
           /**
      -     * Returns true if this format will parse numbers as integers only.
      +     * Returns {@code true} if this format will parse numbers as integers only.
      +     * The {@code ParsePosition} index will be set to the position of the decimal
      +     * symbol. The exact format accepted by the parse operation is locale dependent.
            * For example in the English locale, with ParseIntegerOnly true, the
      -     * string "1234." would be parsed as the integer value 1234 and parsing
      -     * would stop at the "." character.  Of course, the exact format accepted
      -     * by the parse operation is locale dependent and determined by sub-classes
      -     * of NumberFormat.
      +     * string "123.45" would be parsed as the integer value 123.
            *
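
The clarified wording can be checked with a short sketch. It assumes a general-purpose English number format, and the class and method names are hypothetical. It parses the "123.45" example from the javadoc and exposes the resulting `ParsePosition` index.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class IntegerOnlyIndexDemo {
    // Parses text in integer-only mode with an English number format.
    static Number parseIntegerOnly(String text, ParsePosition pp) {
        NumberFormat fmt = NumberFormat.getInstance(Locale.ENGLISH);
        fmt.setParseIntegerOnly(true);
        return fmt.parse(text, pp);
    }

    public static void main(String[] args) {
        ParsePosition pp = new ParsePosition(0);
        Number n = parseIntegerOnly("123.45", pp);
        // The index is expected to point at the decimal symbol.
        System.out.println("value=" + n + " index=" + pp.getIndex());
    }
}
```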

      In DecimalFormat.java, clarify the integer only parsing behavior when strict.

      @@ -2150,10 +2150,7 @@
            *   #getGroupingSize()} is not adhered to
            *   <li> {@link #isGroupingUsed()} returns {@code false}, and the grouping
            *   symbol is found
      -     *   <li> {@link #isParseIntegerOnly()} returns {@code true}, and the decimal
      -     *   separator is found
      -     *   <li> {@link #isGroupingUsed()} returns {@code true} and {@link
      -     *   #isParseIntegerOnly()} returns {@code false}, and the grouping
      +     *   <li> {@link #isGroupingUsed()} returns {@code true} and the grouping
            *   symbol occurs after the decimal separator
            *   <li> Any other characters are found, that are not the expected symbols,
            *   and are not digits that occur within the numerical portion

      Clarify the integer only parsing behavior when used with a multiplier.

      @@ -2917,7 +2931,8 @@
            * have '{@code U+2030}'.
            *
            * <P>Example: with multiplier 100, 1.23 is formatted as "123", and
      -     * "123" is parsed into 1.23.
      +     * "123" is parsed into 1.23. If {@code isParseIntegerOnly()} returns {@code true},
      +     * "123" is parsed into 1.
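
As an illustration of the multiplier wording, the sketch below parses "123" with a multiplier of 100, with and without integer-only parsing. The class name, method name, and the pattern "0.##" are assumptions chosen for the example. The integer-only result of 1 is what the clarified javadoc describes, so the program prints the value rather than assuming it, since older releases may differ.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

public class MultiplierParseDemo {
    // Parses text with a multiplier of 100; returns null on failure.
    static Number parseWithMultiplier(String text, boolean integerOnly) {
        DecimalFormat fmt =
            new DecimalFormat("0.##", DecimalFormatSymbols.getInstance(Locale.US));
        fmt.setMultiplier(100);
        fmt.setParseIntegerOnly(integerOnly);
        return fmt.parse(text, new ParsePosition(0));
    }

    public static void main(String[] args) {
        // Without integer-only parsing, "123" is divided by the multiplier.
        System.out.println("default:      " + parseWithMultiplier("123", false));
        // With integer-only parsing, the javadoc states the result is 1.
        System.out.println("integer only: " + parseWithMultiplier("123", true));
    }
}
```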

      In CompactNumberFormat, clarify the behavior of the ParsePosition index when parsing integers only.

      @@ -2372,6 +2352,8 @@
            * parsed as the value {@code 1234000} (1234 (integer part) * 1000
            * (thousand)) and the fractional part would be skipped.
            * The exact format accepted by the parse operation is locale dependent.
      +     * @implSpec This implementation does not set the {@code ParsePosition} index
      +     * to the position of the decimal symbol, but rather the end of the string.
            *
            * @return {@code true} if compact numbers should be parsed as integers
            *         only; {@code false} otherwise
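
The difference from `NumberFormat` can be observed with a sketch like the one below. It assumes the US locale with the SHORT compact style, where "K" denotes thousands. The class name, method name, and the input "1.5K" are hypothetical choices for the example. The program prints the index so it can be compared against the `@implSpec` note, rather than assuming a value.

```java
import java.text.NumberFormat;
import java.text.ParsePosition;
import java.util.Locale;

public class CompactIntegerOnlyDemo {
    // Parses compact text such as "1K"; returns null on failure.
    static Number parseCompact(String text, boolean integerOnly, ParsePosition pp) {
        NumberFormat fmt =
            NumberFormat.getCompactNumberInstance(Locale.US, NumberFormat.Style.SHORT);
        fmt.setParseIntegerOnly(integerOnly);
        return fmt.parse(text, pp);
    }

    public static void main(String[] args) {
        String text = "1.5K";
        ParsePosition pp = new ParsePosition(0);
        Number n = parseCompact(text, true, pp);
        // Per the @implSpec note, the index is expected at the end of the
        // string rather than at the decimal symbol.
        System.out.println("value=" + n + " index=" + pp.getIndex()
            + " length=" + text.length());
    }
}
```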

            jlu Justin Lu
            jlu Justin Lu
            Naoto Sato