Bug
Resolution: Fixed
P5
8
JSR 335 made this observation in section 15.27:
--
This syntax introduces some new parsing challenges, although they are similar in scope to what is already handled by the Java grammar.
Java has always had an ambiguity between types and expressions after a '(' token (what follows may be a cast or a parenthesized expression). This was made worse in Java 5, which reused the binary operators '<' and '>' in types.
Lambda expressions introduce a new possibility: the tokens following '(' may describe a type, an expression, or a lambda parameter list. Some tokens (annotations, final) are unique to parameter lists, while in other cases there are certain patterns that must be interpreted as parameter lists (two names in a row, a ',' not nested inside of '<' and '>'). And sometimes the ambiguity cannot be resolved until a '->' is encountered, after a ')'. The simplest way to think of how this might be efficiently parsed is with a state machine: each state represents a subset of possible interpretations (type, expression, or parameters), and when the machine transitions to a state in which the set is a singleton, the parser knows which case it is. This does not map very elegantly to a fixed-lookahead grammar, however.
--
The author of ANTLR observes that "nondeterminism ala NFA vs DFA is probably a better term than ambiguity. Ambiguity is probably better used to describe inputs that can be matched by the grammar more than one way. Nondeterminism on the other hand describes sequences that cannot be distinguished with fixed look ahead."
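--
To make the state-machine idea in the quoted text concrete, here is a minimal sketch in Java. It is not the javac implementation and not anything specified by JSR 335. The class `ParenClassifier`, the enum `Reading`, and the pre-split token list are all hypothetical names and simplifications invented for illustration. The rule that an arithmetic operator, a literal, or a nested '(' forces the expression reading is an added assumption that the quoted text does not state. Each loop step narrows the set of possible readings, and scanning stops once the set is a singleton.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: narrows the possible readings of the tokens after a '('. */
class ParenClassifier {
    enum Reading { TYPE, EXPRESSION, PARAMETERS }

    // Assumption (not from the quoted text): these tokens can only start or continue an expression.
    private static final Set<String> EXPRESSION_ONLY = Set.of("+", "-", "*", "/", "(");

    private static boolean isName(String t) {
        return !t.isEmpty() && Character.isJavaIdentifierStart(t.charAt(0));
    }

    private static boolean isExpressionOnly(String t) {
        return EXPRESSION_ONLY.contains(t)
                || (!t.isEmpty() && (Character.isDigit(t.charAt(0)) || t.charAt(0) == '"'));
    }

    /** tokens: the tokens that follow the opening '(' (already split; no real lexer here). */
    static Set<Reading> classify(List<String> tokens) {
        Set<Reading> possible = EnumSet.allOf(Reading.class);
        int angleDepth = 0;           // nesting inside '<' ... '>'
        boolean prevWasName = false;  // used to detect two names in a row
        for (int i = 0; i < tokens.size() && possible.size() > 1; i++) {
            String t = tokens.get(i);
            boolean name = isName(t);
            if (t.startsWith("@") || t.equals("final")) {
                possible.retainAll(EnumSet.of(Reading.PARAMETERS)); // unique to parameter lists
            } else if (name && prevWasName) {
                possible.retainAll(EnumSet.of(Reading.PARAMETERS)); // two names in a row
            } else if (t.equals(",") && angleDepth == 0) {
                possible.retainAll(EnumSet.of(Reading.PARAMETERS)); // ',' outside '<' '>'
            } else if (t.equals("<")) {
                angleDepth++;
            } else if (t.equals(">")) {
                if (angleDepth > 0) angleDepth--;
            } else if (t.equals(")")) {
                // Only the token after ')' tells us whether this was a lambda.
                boolean arrow = i + 1 < tokens.size() && tokens.get(i + 1).equals("->");
                if (arrow) {
                    possible.retainAll(EnumSet.of(Reading.PARAMETERS));
                } else {
                    possible.remove(Reading.PARAMETERS);
                }
                break;
            } else if (isExpressionOnly(t)) {
                possible.retainAll(EnumSet.of(Reading.EXPRESSION));
            }
            prevWasName = name;
        }
        return possible;
    }

    public static void main(String[] args) {
        System.out.println(classify(List.of("x", ")", "->")));
        System.out.println(classify(List.of("x", ")", "+")));
        System.out.println(classify(List.of("int", "x", ")")));
    }
}
```

In this sketch, `(x)` followed by anything other than '->' stays in the two-element state {TYPE, EXPRESSION}, which is the older cast versus parenthesized-expression question that the quoted text says Java has always had. The sketch treats every '<' as opening a type-argument list, so it would mishandle a comparison such as `a < b`. A real parser needs more context than this.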