  1. JDK
  2. JDK-4803179

java.util.regex.Matcher.appendReplacement replacement string shouldn't allow $g


    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • 5.0
    • 1.4.1
    • core-libs
    • tiger
    • x86
    • solaris_7



      Name: nt126004 Date: 01/14/2003


      FULL PRODUCT VERSION :
      java version "1.4.1_01"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.1_01-b01)
      Java HotSpot(TM) Client VM (build 1.4.1_01-b01, mixed mode)

      FULL OPERATING SYSTEM VERSION :
      SunOS 5.7 Generic_106542-22 i86pc i386 i86p



      A DESCRIPTION OF THE PROBLEM :
      Instead of preserving the cleanliness and robustness of Java
      regex compared with Perl programming, this method contains a
      "trap": the encoding of capture groups in the replacement
      string.
      appendReplacement is meant for calculated replacement
      strings (as opposed to hand-typed ones); otherwise, methods
      like replaceAll or replaceFirst should be used.
      The only benefit of the Perl-like $g shortcut is to save a
      little writing (namely replacing $2 by
      " + matcher.group(2) + ").
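To make the comparison concrete, here is a minimal sketch of the two spellings side by side. The class name, method names, pattern and input are invented for illustration and are not part of the report. The groups are restricted to `\w+`, so the captured text can never contain `$` or `\`, and both forms should produce the same result.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupShortcutDemo {
    // Illustrative pattern: two word groups separated by '@'.
    private static final Pattern PAIR = Pattern.compile("(\\w+)@(\\w+)");

    // Uses the "$g" shortcut inside the replacement string.
    static String withShortcut(String input) {
        Matcher m = PAIR.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, "$2 at $1");
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Spells out the same replacement with explicit group() calls.
    // Safe here only because \w+ cannot capture '$' or '\'.
    static String withGroupCalls(String input) {
        Matcher m = PAIR.matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, m.group(2) + " at " + m.group(1));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "bob@home and amy@work";
        System.out.println(withShortcut(input));
        System.out.println(withGroupCalls(input));
    }
}
```

The explicit form costs a few extra characters per group reference, which is the whole saving the shortcut offers.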

      The drawback and complication (I call it a trap) introduced
      in this method undermines the robustness of the
      java.util.regex package: in a calculated replacement
      string, one must scan for $ characters and escape them.
      That is a special, error-prone treatment, likely to be
      forgotten and to cause hard-to-diagnose runtime bugs (the
      replacement string may change every time the program runs,
      and only sometimes happens to contain a $ sign).
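The trap can be illustrated with a small sketch. The class name, the helper `replaceCats` and the sample replacement strings are hypothetical. The helper passes a run-time string to appendReplacement unescaped, as the report describes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DollarTrapDemo {
    // Replaces every "cat" in input with a string computed elsewhere,
    // handing it to appendReplacement without any escaping.
    static String replaceCats(String input, String computed) {
        Matcher m = Pattern.compile("cat").matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, computed);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        // No '$' in the replacement: behaves as expected.
        System.out.println(replaceCats("one cat", "dog"));

        // "$0" is read as a reference to the whole match, so the
        // output is silently wrong rather than an error.
        System.out.println(replaceCats("one cat", "$0.99"));

        // "$5" refers to a group the pattern does not have, so the
        // call fails at run time.
        try {
            replaceCats("one cat", "$5 toy");
        } catch (IndexOutOfBoundsException e) {
            System.out.println("failed: " + e);
        }
    }
}
```

The second case is arguably the worse one, since nothing signals that the output has been corrupted.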


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      The $g feature is a non-feature: I claim its only benefit
      is to save typing a few characters. It is, however, the
      source of bugs 4497669, 4618713, 4621239, 4684543 and 4509697.

      My proposal is to abandon this feature.

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
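      import java.util.regex.Matcher;
      import java.util.regex.Pattern;

      String regex = "cat";
      CharSequence input = "one cat two cats in the yard";
      Pattern p = Pattern.compile(regex);
      Matcher matcher = p.matcher(input);
      StringBuffer sb = new StringBuffer();

      while (matcher.find()) {
          // replacementMethod stands for any code that computes the
          // replacement at run time; a '$' in its result is interpreted.
          matcher.appendReplacement(sb, replacementMethod(matcher.group()));
      }
      matcher.appendTail(sb);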
      ---------- END SOURCE ----------

      CUSTOMER WORKAROUND :
      Either replace every $ in the replacement string with \$
      before calling appendReplacement, or implement the "clean"
      appendReplacement (without the $g feature) yourself:

                  int append = 0;
                  while (matcher.find()) {
                      sb.append(input.subSequence(append, matcher.start()));
                      append = matcher.end();
                      sb.append(replacementMethod(matcher.group()));
                  }
                  sb.append(input.subSequence(append, input.length()));
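For the first workaround, on Java 5 and later the escaping can be delegated to `Matcher.quoteReplacement`, which escapes both `\` and `$`. It is not available in the 1.4.1 release this report was filed against. The sketch below assumes a hypothetical `replacementMethod` that happens to return text containing a `$`. The class and method names are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralReplacementDemo {
    // Hypothetical stand-in for the report's replacementMethod:
    // its result happens to contain a '$' sign.
    static String replacementMethod(String match) {
        return "$5 " + match;
    }

    // Replaces each match with the computed string, taken literally.
    static String replaceLiterally(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (matcher.find()) {
            String computed = replacementMethod(matcher.group());
            // quoteReplacement escapes '\' and '$' so that
            // appendReplacement inserts the string unchanged.
            matcher.appendReplacement(sb, Matcher.quoteReplacement(computed));
        }
        matcher.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(replaceLiterally("cat", "one cat two cats in the yard"));
    }
}
```

Without the `quoteReplacement` call, the same input would fail at run time, because `$5` would be read as a reference to a group the pattern does not define.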
      (Review ID: 178859)
      ======================================================================

            mmcclosksunw Michael Mccloskey (Inactive)
            nthompsosunw Nathanael Thompson (Inactive)