REGRESSION: regex character class negation error


    • 04
    • x86
    • linux, windows_xp
    • Verified



        Name: gm110360 Date: 06/02/2003


        FULL PRODUCT VERSION :
        java version "1.4.2-beta"
        Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2-beta-b19)
        Java HotSpot(TM) Client VM (build 1.4.2-beta-b19, mixed mode)

        FULL OS VERSION :
        Windows XP

        A DESCRIPTION OF THE PROBLEM :
        I wanted to match every character in a string except '>', so I used the regex "[^>]". However, that pattern also fails to match the character "\u203A" (the HTML &rsaquo; character, ›).

        The same applies to '<' and "\u2039" (the HTML &lsaquo; character, ‹).

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        Run the program below.

        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        using JRE1.4.1, you get the correct result (last line is important):

        C:\> c:\Programme\Java\j2re1.4.1_02\bin\java -classpath classes PatternGTTest
        Pattern '>' matches '>'
        Pattern '>' does not match '?'
        Pattern '[^>]' does not match '>'
        Pattern '[^>]' matches '?'
        ACTUAL -
        using JRE1.4.2-beta, you get an incorrect result (see last line):

        C:\> c:\Programme\Java\j2re1.4.2\bin\java -classpath classes PatternGTTest
        Pattern '>' matches '>'
        Pattern '>' does not match '?'
        Pattern '[^>]' does not match '>'
        Pattern '[^>]' does not match '?'


        REPRODUCIBILITY :
        This bug can be reproduced always.

        ---------- BEGIN SOURCE ----------
        import java.util.regex.*;

        public class PatternGTTest {
            public static void main(String[] args) throws Exception {
                checkMatch(">", ">");
                checkMatch(">", "\u203A"); // &rsaquo;
                checkMatch("[^>]", ">");
                checkMatch("[^>]", "\u203A");
            }
            public static void checkMatch(String pat, String in) {
                System.out.print("Pattern '" + pat + "'");
                Pattern p = Pattern.compile(pat);
                if (!p.matcher(in).matches()) System.out.print(" does not match ");
                else System.out.print(" matches ");
                System.out.println("'" + in + "'");
            }
        }

        ---------- END SOURCE ----------
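
        The '?' in the output above is presumably the console's substitute for U+203A, which makes the failing line hard to read. The following is a minimal sketch, not part of the original report, that scans the Basic Multilingual Plane and prints, as hex code points, every character a negated class fails to match. The class name NegatedClassScan and the helper rejectedBy are hypothetical, and the code assumes a current JDK (generics and String.format are not available in 1.4.x). Lone surrogates are skipped to keep the scan to ordinary characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class NegatedClassScan {
    // Hypothetical helper: returns every BMP character (surrogates skipped)
    // that the given single-character pattern fails to match.
    static List<Integer> rejectedBy(String regex) {
        Pattern p = Pattern.compile(regex);
        List<Integer> rejected = new ArrayList<>();
        for (int c = 0; c <= 0xFFFF; c++) {
            if (c >= 0xD800 && c <= 0xDFFF) {
                continue; // skip lone surrogates
            }
            if (!p.matcher(String.valueOf((char) c)).matches()) {
                rejected.add(c);
            }
        }
        return rejected;
    }

    public static void main(String[] args) {
        for (String regex : new String[] {"[^>]", "[^<]"}) {
            StringBuilder line = new StringBuilder("Pattern '" + regex + "' rejects:");
            for (int c : rejectedBy(regex)) {
                // Print the code point in hex so the console encoding does not matter.
                line.append(String.format(" U+%04X", c));
            }
            System.out.println(line);
        }
    }
}
```

        On a runtime without the regression, each line should list only the excluded character itself (U+003E for "[^>]", U+003C for "[^<]"). Any additional entry, such as U+203A or U+2039, would indicate the behavior described in this report.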

        CUSTOMER SUBMITTED WORKAROUND :
        Use Java 1.4.1.

        Release Regression From : 1.4.1_03
        The above release is the last release in which this behavior
        was known to work correctly. A regression was introduced after it.

        (Review ID: 186810)
        ======================================================================

              Assignee:
              Michael Mccloskey (Inactive)
              Reporter:
              Girish Manwani (Inactive)