JDK-8009230

Nashorn rejects extended RegExp syntax accepted by all major JS engines

Details

    • Bug
    • Resolution: Fixed
    • P4
    • 8
    • None
    • core-libs
    • None
    • b86
    • Verified

    Description

      From https://mail.mozilla.org/pipermail/es-discuss/2010-December/012289.html :

      It's far from the only extension to RegExp syntax that is common to most
      implementations. In fact, the extensions are both extensive and consistent
      across browsers. A quick check through the possible syntax errors shows
      the following:

      // Invalid ControlEscape/IdentityEscape character treated as literal.
         /\z/; // Invalid escape, same as /z/
      // Incomplete/Invalid ControlEscape treated as either "\\c" or "c"
         /\c/; // same as /c/ or /\\c/
         /\c2/; // same as /c2/ or /\\c2/
      // Incomplete HexEscapeSequence escape treated as either "\\x" or "x".
         /\x/; // incomplete x-escape
         /\x1/; // incomplete x-escape
         /\x1z/; // incomplete x-escape
      // Incomplete UnicodeEscapeSequence escape treated as either "\\u" or "u".
         /\u/; // incomplete u-escape
         /\uz/; // incomplete u-escape
         /\u1/; // incomplete u-escape
         /\u1z/; // incomplete u-escape
         /\u12/; // incomplete u-escape
         /\u12z/; // incomplete u-escape
         /\u123/; // incomplete u-escape
         /\u123z/; // incomplete u-escape
      // Bad quantifier range:
         /x{z/; // same as /x\{z/
         /x{1z/; // same as /x\{1z/
         /x{1,z/; // same as /x\{1,z/
         /x{1,2z/; // same as /x\{1,2z/
         /x{10000,20000z/; // same as /x\{10000,20000z/
      // Notice: It needs arbitrary lookahead to determine the invalidity,
      // except Mozilla that limits the numbers.

      // Zero-initialized Octal escapes.
         /\012/; // same as /\x0a/

      // Nonexisting back-references treated as octal escapes:
         /\5/; // same as /\x05/

      // Invalid PatternCharacter accepted unescaped
         /]/;
         /{/;
         /}/;

      // Bad escapes also inside CharacterClass.
         /[\z]/;
         /[\c]/;
         /[\c2]/;
         /[\x]/;
         /[\x1]/;
         /[\x1z]/;
         /[\u]/;
         /[\uz]/;
         /[\u1]/;
         /[\u1z]/;
         /[\u12]/;
         /[\u12z]/;
         /[\u123]/;
         /[\u123z]/;
         /[\012]/;
         /[\5]/;
      // And in addition:
         /[\B]/;
         /()()[\2]/; // Valid backreference should be invalid.

      Many of these regular expressions currently throw syntax errors in Nashorn.
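
One way to see which of the listed forms a given engine rejects is to pass each one to the RegExp constructor and record any exception. The following is a minimal sketch, not the test that accompanied the fix. The helper names `probe` and `matchesClaim` are made up for illustration, and only a subset of the patterns is included. The script sticks to ES5 syntax and assumes either `console.log` (Node.js, browsers) or a global `print` function (as in Nashorn's `jjs` shell) is available.

```javascript
// Pattern sources taken from the list in the description. They are written
// as string literals, so every regex backslash is doubled.
var sources = [
  "\\z", "\\c", "\\c2",
  "\\x", "\\x1", "\\x1z",
  "\\u", "\\uz", "\\u1", "\\u12z", "\\u123z",
  "x{z", "x{1z", "x{1,z", "x{1,2z", "x{10000,20000z",
  "\\012", "\\5",
  "]", "{", "}",
  "[\\z]", "[\\c]", "[\\x1z]", "[\\u123z]", "[\\012]", "[\\5]",
  "[\\B]", "()()[\\2]"
];

// Output helper: console.log where it exists, otherwise a global print().
function log(message) {
  if (typeof console !== "undefined") {
    console.log(message);
  } else {
    print(message);
  }
}

// Hypothetical helper: returns null if the source compiles,
// otherwise the error text the engine produced.
function probe(source) {
  try {
    new RegExp(source);
    return null;
  } catch (e) {
    return String(e);
  }
}

// Hypothetical helper: true if the pattern compiles and matches the input
// that the description says it should be equivalent to.
function matchesClaim(source, input) {
  try {
    return new RegExp("^" + source + "$").test(input);
  } catch (e) {
    return false;
  }
}

// Collect every pattern the current engine refuses to compile.
var rejected = [];
sources.forEach(function (source) {
  var error = probe(source);
  if (error !== null) {
    rejected.push(source);
    log("rejected " + JSON.stringify(source) + ": " + error);
  }
});
log(rejected.length + " of " + sources.length + " patterns rejected");

// A few of the "same as" claims from the list, as [pattern, expected match].
var claims = [
  ["\\z", "z"],        // invalid escape treated as the literal character
  ["x{z", "x{z"],      // bad quantifier range treated as a literal brace
  ["\\012", "\n"],     // octal escape, same as \x0a
  ["\\5", "\x05"]      // nonexistent back-reference read as an octal escape
];
claims.forEach(function (claim) {
  log(JSON.stringify(claim[0]) + " behaves as described: " +
      matchesClaim(claim[0], claim[1]));
});
```

Running the same script in several engines gives a quick comparison: the patterns reported as rejected are the ones where that engine is stricter than the others.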

          People

            Assignee: Hannes Wallnoefer (hannesw)
            Reporter: Hannes Wallnoefer (hannesw)