JDK-6970890

Single XML char "-" in a regex char class expression


    • 1.4
    • generic
    • generic
    • Verified

        The specification (http://www.w3.org/TR/2004/REC-xmlschema-2-20041028/datatypes.html#regexs) states:

        [12] charClassExpr ::= '[' charGroup ']'
        [13] charGroup ::= posCharGroup | negCharGroup | charClassSub
        [14] posCharGroup ::= ( charRange | charClassEsc )+
        [17] charRange ::= seRange | XmlCharIncDash
        [18] seRange ::= charOrEsc '-' charOrEsc
        [20] charOrEsc ::= XmlChar | SingleCharEsc
        [21] XmlChar ::= [^\#x2D#x5B#x5D]
        [22] XmlCharIncDash ::= [^\#x5B#x5D]

        A single XML character is a ·character range· that identifies the set of characters containing only itself. All XML characters are valid character ranges, except as follows:

            * The [, ], - and \ characters are not valid character ranges;
            * The ^ character is only valid at the beginning of a ·positive character group· if it is part of a ·negative character group·
            * The - character is a valid character range only at the beginning or end of a ·positive character group·.

        Note: The grammar for ·character range· as given above is ambiguous, but the second and third bullets above together remove the ambiguity.

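        The regexes 'a[-]?c' and '[-]' are therefore valid: they derive through rules 12-13-14-17-22 (charClassExpr, charGroup, posCharGroup, charRange, XmlCharIncDash), and the third bullet above permits '-' as a character range because it sits at the beginning (and end) of the positive character group.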
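
        The derivation above can be sketched as a small checker. This is only an illustration of the quoted productions, not the parser used by the JDK: the class name DashCharClassCheck and its methods are hypothetical, and the sketch assumes a simplified grammar that covers positive character groups only (no escapes, no negative groups, no character class subtraction).

```java
// Hypothetical helper, not part of the JDK: checks a character class
// expression against a simplified reading of productions [12]-[22].
final class DashCharClassCheck {

    // XmlChar per production [21], plus the prose rule that '\' is not a
    // valid character range (escapes are out of scope for this sketch).
    static boolean isXmlChar(char c) {
        return c != '-' && c != '[' && c != ']' && c != '\\';
    }

    // Returns true if s is '[' posCharGroup ']' under the simplified grammar.
    static boolean isValidCharClassExpr(String s) {
        if (s.length() < 3 || s.charAt(0) != '[' || s.charAt(s.length() - 1) != ']') {
            return false;
        }
        String group = s.substring(1, s.length() - 1);
        if (group.charAt(0) == '^') {
            return false; // negative groups are not handled here
        }
        int i = 0;
        int n = group.length();
        while (i < n) {
            char c = group.charAt(i);
            if (c == '-') {
                // Third bullet: a lone '-' is a valid range only at the
                // beginning or end of the positive character group.
                if (i != 0 && i != n - 1) {
                    return false;
                }
                i++;
            } else if (isXmlChar(c)) {
                // seRange [18]: XmlChar '-' XmlChar, with start <= end.
                boolean isRange = i + 2 < n
                        && group.charAt(i + 1) == '-'
                        && isXmlChar(group.charAt(i + 2))
                        && c <= group.charAt(i + 2);
                i += isRange ? 3 : 1;
            } else {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (String expr : new String[] {"[-]", "[a-]", "[-a]", "[a-z-]", "[a-b-c]"}) {
            System.out.println(expr + " valid: " + isValidCharClassExpr(expr));
        }
    }
}
```

        Under these assumptions '[-]', '[a-]', '[-a]' and '[a-z-]' are accepted, while '[a-b-c]' is rejected because its second '-' is neither part of a range nor at the beginning or end of the group.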

              joehw Joe Wang
              lkuskov Leonid Kuskov