JDK-5098443

Using \P with java.lang.Character classes in regex throws StringIndexOutOfBoundsException


    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • 6
    • 5.0
    • core-libs
    • b14
    • generic, x86
    • generic, windows_2000
    • Verified

      Tried on Solaris-9 JDK 1.5.0-beta3-b58.

      Test case:
      import java.util.regex.*;

      public class Test1 {

          public void check(String str1, String str2) {
              try {
                  Pattern p = Pattern.compile(str1);   // str1 is the regex
                  Matcher m = p.matcher(str2);         // str2 is the input
                  while (m.find()) {
                      System.out.println(m.group());
                  }
              } catch (Exception e) {
                  e.printStackTrace();
              }
          }

          public static void main(String[] args) {

              Test1 ref = new Test1();
              ref.check(args[0], args[1]);
          }

      }

      Execute:

      java Test1 "\P{javaUpperCase}" "J@v"

      This works fine and prints:
      @
      v
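
The working case can also be reproduced without command-line arguments. The following is a minimal sketch, not part of the original report: the class name NegatedJavaClassDemo and the helper findAll are hypothetical names chosen for illustration. It shows \P{javaUpperCase} matching exactly the characters that \p{javaUpperCase} rejects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not from the original report.
public class NegatedJavaClassDemo {

    // Collects every match of regex in input, in the order find() returns them.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // In a Java string literal the backslash must be doubled.
        System.out.println(findAll("\\P{javaUpperCase}", "J@v")); // prints [@, v]
        System.out.println(findAll("\\p{javaUpperCase}", "J@v")); // prints [J]
    }
}
```

The first call corresponds to the command line above; the second is included only to show that the two classes partition the input between them.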

      But executing the following:

      java Test1 "\p{javaUpperCase}\P{javaUpperCase}+" "J@v"

      throws this exception:
      java.lang.StringIndexOutOfBoundsException: String index out of range: 3
      at java.lang.String.charAt(String.java:558)
      at java.util.regex.Pattern.countChars(Pattern.java:2791)
      at java.util.regex.Pattern.access$000(Pattern.java:595)
      at java.util.regex.Pattern$Not.match(Pattern.java:3764)
      at java.util.regex.Pattern$Curly.match0(Pattern.java:4222)
      at java.util.regex.Pattern$Curly.match(Pattern.java:4196)
      at java.util.regex.Pattern$JavaTypeClass.match(Pattern.java:3595)
      at java.util.regex.Pattern$Start.match(Pattern.java:3019)
      at java.util.regex.Matcher.search(Matcher.java:1092)
      at java.util.regex.Matcher.find(Matcher.java:528)
      at Test1.check(Test1.java:9)
      at Test1.main(Test1.java:20)
       
      The documentation does not describe the behaviour of \P when it is used with java.lang.Character classes in a regex.
      Is the behaviour the same as that documented for the Unicode classes?
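
If \P is taken to negate a java.lang.Character class in the same way it negates a Unicode class, the failing pattern should be equivalent to one that writes the negation explicitly as [^\p{javaUpperCase}]. The sketch below compares the two under that assumption. The class name CompoundPatternCheck and the helper firstMatch are hypothetical and not part of the original report.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical check class, not from the original report.
public class CompoundPatternCheck {

    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "J@v";
        // The pattern from the report, with backslashes doubled for a Java literal.
        String reported = "\\p{javaUpperCase}\\P{javaUpperCase}+";
        // Assumed-equivalent form using an explicit negated character class.
        String explicit = "\\p{javaUpperCase}[^\\p{javaUpperCase}]+";

        System.out.println(firstMatch(reported, input));
        System.out.println(firstMatch(explicit, input));
    }
}
```

On a JDK that contains the fix, both lines should print J@v, since 'J' is upper case and neither '@' nor 'v' is. On the affected build, the first call would be expected to throw the StringIndexOutOfBoundsException shown above instead, because this sketch does not catch it.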

            martin Martin Buchholz
            savadhansunw Seetharama Avadhanam (Inactive)