JDK-6609854

Regex does not match correctly for negative nested character classes

    • Type: Bug
    • Resolution: Fixed
    • Priority: P3
    • 9
    • 6
    • core-libs
    • None
    • b119
    • generic
    • generic

      > >> I have been looking into the definition of [character set]
      > >> expressions in Java regular expressions, to understand what needs to
      > >> be done to make ICU be compatible, or more compatible at least.
      > >>
      > >> There does not appear to be any formal definition for [set
      > >> expressions], or at least not that I can find.
      > >>
      > >> Trying tests, one aspect of the behavior seems really odd. It would
      > >> be good if we could find out from Sun whether it was really intended
      > >> to work the way that it does.
      > >>
      > >> The question concerns the negation of a set,
      > >> [^0-9], to get everything except for the ASCII digits, for example.
      > >>
      > >> In Java, the negation does _not_ apply to anything appearing in
      > >> nested [brackets]
      > >>
      > >> So [^c] does not match "c", as you would expect.
      > >> [^[c]] does match "c". Not what I would expect.
      > >> [[^c]] does not match "c"
      > >>
      > >> The same holds true for ranges or property expressions - if they're
      > >> inside brackets, a negation at an outer level does not affect them.
      > >>
      > >> [^a-z] is opposite from [^[a-z]]
      > >>
      > >> And the same seems to hold for set expressions with &&, although the
      > >> cases become hard to understand.
      > >>
      > >> Perl and POSIX behavior doesn't provide any guidance here, as they do
      > >> not support nested brackets at all - a '[' is not special within a
      > >> set, and just becomes yet another member of the set.
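
The cases quoted above can be checked with a small program. This is only a sketch: the class name NestedNegationCheck and the helper matches are made up for illustration, and only java.util.regex.Pattern is used. No expected output is given for the nested cases because this issue is marked Fixed, so the result for patterns such as [^[c]] may differ between JDK releases.

```java
import java.util.regex.Pattern;

public class NestedNegationCheck {
    // Hypothetical helper: true if the entire input is matched by the regex.
    static boolean matches(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        // Character classes taken from the report; each is tested against "c".
        String[] regexes = {"[^c]", "[^[c]]", "[[^c]]", "[^a-z]", "[^[a-z]]"};

        // Print the runtime version, since the nested results may depend on it.
        System.out.println("java.version = " + System.getProperty("java.version"));

        for (String regex : regexes) {
            System.out.printf("%-10s matches \"c\": %b%n", regex, matches(regex, "c"));
        }
    }
}
```

Running this on the JDK release of interest shows directly whether an outer ^ is applied to a nested bracket expression on that release.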

            sherman Xueming Shen
