Bug
Resolution: Fixed
P3
1.4.0
beta2
generic
generic
Verified
Name: bsC130419 Date: 06/05/2001
java version "1.4.0-beta"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.0-beta-b65)
Java HotSpot(TM) Client VM (build 1.4.0-beta-b65, mixed mode)
The class-description JavaDoc for the Pattern class suffers
from several errors and omissions which make it difficult for
developers to use the new regex classes. They make it even
more difficult to identify bugs, since we can't know for sure
what some of the regex constructs are supposed to do. In
particular, there is no description of the conditional
construct, '(?(X)Y|Z)', and no mention at all of the first-
occurrence-of construct, 'X!'. If you post this submission
on the BugParade, I hope you will also post the specs for
those two constructs, so that we can get on with squashing
bugs.
Here is a list of problems that I have found with the
JavaDoc, in the order that they appear:
Section "Summary of regular-expression constructs"
o The "Characters" subsection doesn't list the vertical-tab
character, '\v', which is supported. (This character also
needs to be added to the descriptions of the "\s" and
"{space}" constructs.)
o In subsection "Boundary matchers", the descriptions of "^"
and "$" should state that their default behavior is the
same as "\A" and "\Z", respectively, and that they also
match line boundaries when multiline mode is set.
o In subsection "Back references", need to add the following
entry:
\Rnn Whatever the nn'th capturing group matched
o "Special constructs" should include entries for (?(X)Y),
(?(X)Y|Z), and X!. Especially that last one; you've
promoted the '!' to a first-class metacharacter, but didn't
tell anyone. We at least need to know that we have to
quote it with a backslash to match a literal '!'.
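
The boundary-matcher and vertical-tab items above can be checked
with a short program. This is only a sketch, written against the
java.util.regex API as it exists in released JDKs, not against
build 1.4.0-beta-b65; the class name, helper method, and sample
input are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; names and sample input are illustrative only.
class BoundaryMatcherDemo {

    // Counts how many times the regex is found in the input.
    static int countMatches(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "one\ntwo\nthree\n";

        // Default mode: "^" behaves like "\A", "$" behaves like "\Z".
        System.out.println(countMatches("^\\w+", 0, input));    // 1
        System.out.println(countMatches("\\A\\w+", 0, input));  // 1
        System.out.println(countMatches("\\w+$", 0, input));    // 1
        System.out.println(countMatches("\\w+\\Z", 0, input));  // 1

        // Multiline mode: "^" and "$" also match at line boundaries.
        System.out.println(countMatches("^\\w+", Pattern.MULTILINE, input)); // 3
        System.out.println(countMatches("\\w+$", Pattern.MULTILINE, input)); // 3

        // "\s" matches the vertical-tab character (U+000B).
        System.out.println(Pattern.matches("\\s", "\u000B"));   // true
    }
}
```

If a build prints different counts, that points to either a bug
or an undocumented difference in how "^" and "$" are handled.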
Section "Comparison to Perl 5"
o Under "Perl constructs not supported by this class:", I
think the first item should read, "The embedded-code
constructs (?{code}) and (??{code})". The conditionals
_are_ supported, though they work differently than in Perl
(more about that later).
o "Constructs supported by this class but not by Perl:"
should include an entry for the X! construct, along with
a complete description of what it does. I'm about 99%
sure that I've found a bug in this construct, but I would
like to see some kind of spec before I report it.
o Under "Notable differences from Perl:", the description of
Pattern's backreferences is misleading: it seems to be
saying that "\1234" will be interpreted as a reference to
capturing group #1,234. Actually, it comes out as a
reference to group #1 followed by the literal sequence
"234", no matter how many capturing groups there are. If
you want to refer back to groups 10 through 99, you have to
use the "\Rnn" form--and again, any digits after the second
one will simply be matched literally.
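
The backreference item above can be illustrated with a small
test. This is a sketch only: the class name and inputs are
hypothetical, and it was written against released JDKs, where a
multi-digit backreference is honored only if that many groups
exist. Build b65 may well behave differently in the ten-group
case, as described above.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names and inputs are illustrative only.
class BackReferenceDemo {

    // True if the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // One capturing group: "\1234" is group 1 followed by literal "234".
        String oneGroup = "(a)\\1234";
        System.out.println(matches(oneGroup, "aa234")); // true
        System.out.println(matches(oneGroup, "aa"));    // false

        // Ten capturing groups: released JDKs read "\10" as group 10.
        String tenGroups = "(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)\\10";
        System.out.println(matches(tenGroups, "abcdefghijj")); // true
    }
}
```

Running the ten-group case against a given build shows quickly
whether "\10" is read as group 10 or as group 1 followed by a
literal "0".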
And finally, about those conditional constructs. In Perl, the
conditional--the "(X)" in (?(X)Y|Z)--can only be a zero-width
assertion. It may be a lookahead or lookbehind construct, or
X may be a number, representing a query as to whether group #X
matched or not. The Pattern class, on the other hand, allows
(X) to be any valid subexpression, and it doesn't seem to
support numbered "backassertions". If these differences are
intentional, then an entry to that effect needs to be added to
"Notable differences"; otherwise, another bug report is in
order.
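
Until a spec is posted, one way to find out which forms of the
conditional a given build accepts is simply to try compiling
them. The probe below is a minimal sketch; the class name, helper
method, and candidate patterns are hypothetical. Note that the
Pattern documentation in later releases lists the conditional
constructs among the Perl constructs that are not supported, so
results will depend on the build under test.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical probe class; names and candidate patterns are illustrative only.
class ConstructProbe {

    // True if the running JDK's Pattern compiles the regex without error.
    static boolean accepts(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "(?(a)b|c)",    // condition is an arbitrary subexpression
            "(?(?=a)b|c)",  // condition is a lookahead, as Perl permits
            "(a)(?(1)b|c)"  // condition asks whether group 1 matched
        };
        for (String regex : candidates) {
            String verdict = accepts(regex) ? "accepted" : "rejected";
            System.out.println(regex + " -> " + verdict);
        }
    }
}
```

A pattern that compiles only shows that the syntax is accepted;
it says nothing about what the construct is supposed to match,
which is why the spec is still needed.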
(Review ID: 125889)
======================================================================