JDK-6173522

\Q...\E should be a regexp-compile-time, not regexp-run-time, regular expression construct


    • Bug
    • Resolution: Fixed
    • P3
    • 6
    • 6
    • core-libs
    • None
    • b14
    • generic
    • generic
    • Verified

      Regular expressions can use the construct \Q...\E to quote characters so that they are interpreted literally rather than as regular expression metacharacters, as with Perl's quotemeta (see perlfunc(1)).
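
      A minimal sketch of the quoting behaviour described above, using java.util.regex. The class name QuoteConstructDemo and the helper matches are illustrative assumptions, not part of the report.

```java
import java.util.regex.Pattern;

public class QuoteConstructDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Unquoted, '.' is a metacharacter and matches any single character.
        System.out.println(matches("a.b", "axb"));
        // Inside \Q...\E, '.' stands only for a literal dot.
        System.out.println(matches("\\Qa.b\\E", "axb"));
        System.out.println(matches("\\Qa.b\\E", "a.b"));
    }
}
```

      Outside a character class, the quoted form behaves the same whether \Q...\E is handled at compile time or at run time; the difference only shows up in contexts such as the ones discussed later in this report.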

      In Perl, however, the \Q...\E construct is not part of the runtime
      representation. It is merely a convenience for constructing the string
      that is passed as input to the regular expression compiler.

      Java should do the same.

      Currently, the regexp compiler in Pattern.java compiles the construct
      \Q...\E into an internal "Slice" node, which matches only the exact
      sequence of characters quoted within the slice. This does not make
      sense in every context, in particular within a character class.

      For example, the regular expression
      [\Qabc\Edef]
      matches
      "e"
      but not
      "a"

      An initial pass should convert the regular expression
      to [abcdef]
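
      One way such an initial pass could look is sketched below. This is an illustration under stated assumptions, not the code in Pattern.java: the class QuoteExpansionSketch and the method expandQuotes are hypothetical names, and edge cases such as a quoted digit directly following a numeric escape are ignored.

```java
public class QuoteExpansionSketch {
    // Hypothetical helper: rewrites every \Q...\E span as individually
    // backslash-escaped characters, leaving the rest of the regex untouched.
    static String expandQuotes(String regex) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < regex.length()) {
            if (regex.startsWith("\\Q", i)) {
                int end = regex.indexOf("\\E", i + 2);
                // An unterminated \Q quotes everything up to the end of the pattern.
                String quoted = (end < 0) ? regex.substring(i + 2)
                                          : regex.substring(i + 2, end);
                quoted.codePoints().forEach(cp -> {
                    // Letters and digits are copied as-is: a backslash in front
                    // of them would form an escape such as \d or \1.
                    if (!Character.isLetterOrDigit(cp)) {
                        out.append('\\');
                    }
                    out.appendCodePoint(cp);
                });
                i = (end < 0) ? regex.length() : end + 2;
            } else if (regex.charAt(i) == '\\' && i + 1 < regex.length()) {
                // Copy an ordinary escape pair intact, so that an escaped
                // backslash followed by 'Q' is not mistaken for \Q.
                out.append(regex, i, i + 2);
                i += 2;
            } else {
                out.append(regex.charAt(i++));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expandQuotes("[\\Qabc\\Edef]"));
        System.out.println(expandQuotes("abc\\Qdef\\E"));
    }
}
```

      Because the output is an ordinary regular expression string, the regexp compiler proper never needs to know that \Q...\E was present, which is the Perl-like behaviour this report asks for.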

      Similarly,
      [\Q[]\E]
      should match the string
      "["
      but currently does not.
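
      The two character-class cases above can be checked with a small program like the following. The class name QuoteInCharClassCheck and the helper matches are illustrative assumptions. On a JDK that contains this fix, all three calls are expected to return true; on an affected build the first and third would not.

```java
import java.util.regex.Pattern;

public class QuoteInCharClassCheck {
    // Hypothetical helper: true when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // [\Qabc\Edef] should behave like [abcdef].
        System.out.println(matches("[\\Qabc\\Edef]", "a"));
        System.out.println(matches("[\\Qabc\\Edef]", "e"));
        // [\Q[]\E] should accept a literal '['.
        System.out.println(matches("[\\Q[]\\E]", "["));
    }
}
```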

      Making \Q...\E a regexp-compile-time construct also improves
      performance. For example, only one internal "Slice"
      node is created for a construct like

      abc\Qdef\E

      instead of two, since it is treated like:

      abcdef
      ###@###.### 10/25/04 18:57 GMT
      ###@###.### 10/26/04 04:34 GMT

            martin Martin Buchholz