JDK-8235812

Unicode linebreak with quantifier does not match valid input

Details

    • Bug
    • Resolution: Fixed
    • P3
    • 15
    • None
    • core-libs
    • None

    Description

      The linebreak matcher \R is defined to be equivalent to \u000D\u000A|[\u000A\u000B\u000C\u000D\u0085\u2028\u2029].

      Therefore, the regex \R{2} (written "\\R{2}" as a Java string literal) must match the two-character sequence \r\n: the first \R matches \r, and the second \R matches \n.
      In fact, it does not.

      import java.util.regex.*;

      public class RR {
          public static void main(String[] args) throws Throwable {
              Pattern p = Pattern.compile("\\R{2}");
              System.out.println(Boolean.toString(p.matcher("\r\r").matches()));
              System.out.println(Boolean.toString(p.matcher("\r\n").matches()));
              System.out.println(Boolean.toString(p.matcher("\n\r").matches()));
              System.out.println(Boolean.toString(p.matcher("\n\n").matches()));
              System.out.println(Boolean.toString(p.matcher("\r\n\r").matches()));
              System.out.println(Boolean.toString(p.matcher("\r\r\n").matches()));
              System.out.println(Boolean.toString(p.matcher("\r\n\n").matches()));
              System.out.println(Boolean.toString(p.matcher("\n\r\n").matches()));
              System.out.println(Boolean.toString(p.matcher("\r\n\r\n").matches()));
          }
      }

      This prints the following. All nine results are expected to be 'true', but the second one (input \r\n) is 'false':

      true
      false
      true
      true
      true
      true
      true
      true
      true
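
For comparison, the expected behaviour can be checked by writing out the definition of \R by hand and applying the same quantifier to it. The following is only a sketch: the class name LinebreakExpansion and its helper methods are made up for illustration, and the non-capturing group is added so that {2} applies to the whole alternation. The hand-expanded pattern should accept all nine inputs from the reproducer, including \r\n, because the regex engine can backtrack from the \u000D\u000A alternative to the single-character alternative.

```java
import java.util.regex.Pattern;

public class LinebreakExpansion {
    // Hand-written equivalent of \R as given in the description, wrapped in a
    // non-capturing group so that {2} quantifies the whole alternation.
    // (Illustrative only; not taken from the JDK sources.)
    static final Pattern EXPANDED = Pattern.compile(
            "(?:\\u000D\\u000A|[\\u000A\\u000B\\u000C\\u000D\\u0085\\u2028\\u2029]){2}");

    // The pattern from the report, for side-by-side comparison.
    static final Pattern BUILT_IN = Pattern.compile("\\R{2}");

    static boolean expandedMatches(String input) {
        return EXPANDED.matcher(input).matches();
    }

    static boolean builtInMatches(String input) {
        return BUILT_IN.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {
            "\r\r", "\r\n", "\n\r", "\n\n",
            "\r\n\r", "\r\r\n", "\r\n\n", "\n\r\n", "\r\n\r\n"
        };
        for (String s : inputs) {
            // First column: hand-expanded pattern; second column: built-in \R{2}.
            System.out.println(expandedMatches(s) + " " + builtInMatches(s));
        }
    }
}
```

On a JDK affected by this bug, the two columns should differ for the input \r\n only; on a JDK that contains the fix, they should agree on every line.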

            People

              igerasim Ivan Gerasimov