Details
Bug - Resolution: Fixed - P4 - None - None - b86 - Verified
Description
From https://mail.mozilla.org/pipermail/es-discuss/2010-December/012289.html :
It's far from the only extension to RegExp syntax that is common to most
implementations. In fact, the extensions are both extensive and consistent
across browsers. A quick check through the possible syntax errors shows
the following:
// Invalid ControlEscape/IdentityEscape character treated as literal.
/\z/; // Invalid escape, same as /z/
// Incomplete/Invalid ControlEscape treated as either "\\c" or "c"
/\c/; // same as /c/ or /\\c/
/\c2/; // same as /c2/ or /\\c2/
// Incomplete HexEscapeSequence escape treated as either "\\x" or "x".
/\x/; // incomplete x-escape
/\x1/; // incomplete x-escape
/\x1z/; // incomplete x-escape
// Incomplete UnicodeEscapeSequence escape treated as either "\\u" or "u".
/\u/; // incomplete u-escape
/\uz/; // incomplete u-escape
/\u1/; // incomplete u-escape
/\u1z/; // incomplete u-escape
/\u12/; // incomplete u-escape
/\u12z/; // incomplete u-escape
/\u123/; // incomplete u-escape
/\u123z/; // incomplete u-escape
// Bad quantifier range:
/x{z/; // same as /x\{z/
/x{1z/; // same as /x\{1z/
/x{1,z/; // same as /x\{1,z/
/x{1,2z/; // same as /x\{1,2z/
/x{10000,20000z/; // same as /x\{10000,20000z/
// Notice: It needs arbitrary lookahead to determine the invalidity,
// except Mozilla that limits the numbers.
// Zero-initialized Octal escapes.
/\012/; // same as /\x0a/
// Nonexisting back-references treated as octal escapes:
/\5/; // same as /\x05/
// Invalid PatternCharacter accepted unescaped
/]/;
/{/;
/}/;
// Bad escapes also inside CharacterClass.
/[\z]/;
/[\c]/;
/[\c2]/;
/[\x]/;
/[\x1]/;
/[\x1z]/;
/[\u]/;
/[\uz]/;
/[\u1]/;
/[\u1z]/;
/[\u12]/;
/[\u12z]/;
/[\u123]/;
/[\u123z]/;
/[\012]/;
/[\5]/;
// And in addition:
/[\B]/;
/()()[\2]/; // Valid backreference should be invalid.
Many of these regular expressions currently throw syntax errors in Nashorn.
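One way to check which of these patterns a given engine accepts is to compile them from strings at run time, so that a rejected pattern surfaces as a catchable SyntaxError instead of a parse failure of the whole script. The following is a minimal sketch, not a test from the Nashorn code base: the helper name `compiles` and the selection of samples are assumptions made for illustration.

```javascript
// Hypothetical helper: returns true if the engine accepts the pattern source,
// false if it rejects it with a SyntaxError. Other errors are rethrown.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) {
      return false;
    }
    throw e;
  }
}

// A selection of the patterns listed above, written as string literals
// (backslashes doubled) so they are only parsed when RegExp is called.
var samples = [
  "\\z",        // invalid IdentityEscape
  "\\c",        // incomplete ControlEscape
  "\\x1z",      // incomplete HexEscapeSequence
  "\\u12z",     // incomplete UnicodeEscapeSequence
  "x{1,z",      // bad quantifier range
  "\\012",      // octal escape
  "\\5",        // nonexistent back-reference
  "]",          // unescaped PatternCharacter
  "{",
  "[\\B]",      // \B inside a CharacterClass
  "()()[\\2]"   // back-reference inside a CharacterClass
];

// One entry per sample: the source and whether the engine accepted it.
var report = samples.map(function (src) {
  return { source: src, accepted: compiles(src) };
});
```

Each entry in `report` can then be printed with whatever output function the host provides (for example `print` in Nashorn's shell or `console.log` in a browser). An engine that implements the extensions described above would be expected to report every sample as accepted.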