JDK-8146270

XMLStreamReader can fail when underlying Reader.read(cbuf,off,len) returns 0

    • Type: Bug
    • Resolution: Cannot Reproduce
    • Priority: P4
    • None
    • 7u71
    • xml
    • x86_64
    • windows_7

      FULL PRODUCT VERSION :
      java version "1.7.0_71"
      Java(TM) SE Runtime Environment (build 1.7.0_71-b14)
      Java HotSpot(TM) 64-Bit Server VM (build 24.71-b01, mixed mode)


      ADDITIONAL OS VERSION INFORMATION :
      Microsoft Windows [Version 6.1.7601]

      A DESCRIPTION OF THE PROBLEM :
      I have a class that extends FilterReader, and I use this class to feed into an XMLStreamReader. The filter suppresses some characters obtained by an underlying reader, and occasionally this results in its read(cbuf,off,len) method returning zero.

      When this happens, the XMLStreamReader can mistakenly consume characters in the buffer starting at the offset passed in the read() call, even though no characters from that position forward constitute valid data.
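A minimal sketch of the kind of filter described above, to make the zero-length result concrete. The class name `StrippingReader` and the helper `isXmlChar` are hypothetical and are not taken from the reporter's code; the validity check is a simplification of the XML 1.0 Char production that lets surrogate halves through unchecked. Only the bulk `read(cbuf,off,len)` method is overridden here.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical filter: drops characters that are not legal in XML 1.0 content. */
class StrippingReader extends FilterReader {
    StrippingReader(Reader in) {
        super(in);
    }

    /** Simplified validity check; surrogate halves are passed through unchecked. */
    static boolean isXmlChar(char c) {
        return c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0xFFFD);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        int n = in.read(cbuf, off, len);
        if (n <= 0) {
            return n; // EOF (-1) or nothing read
        }
        int kept = 0;
        for (int i = 0; i < n; i++) {
            char c = cbuf[off + i];
            if (isXmlChar(c)) {
                cbuf[off + kept++] = c; // compact accepted chars toward off
            }
        }
        // kept may be 0 when every char was rejected; in that case the
        // rejected chars are still sitting in cbuf starting at off.
        return kept;
    }
}
```

With this sketch, wrapping a `StringReader` over `"\u0001\u0002<a/>"` and reading into a two-character buffer returns 0 on the first call, while the two rejected characters remain in the buffer at the requested offset.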

      As best I can make out, the problem arises in XMLEntityScanner.scanContent(XMLString). At line 910, it copies the final existing character to the front of the current entity's buffer and then attempts to load additional content. When the load(...) method it invokes receives a character count of zero from read(...), it does not update the current entity's "count" field, as it does for a nonzero result, and the caller then incorrectly assumes that more than a single character is available.
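The mechanism described above can be illustrated with a toy model. This is not the JDK's XMLEntityScanner code: the class `ToyEntityBuffer`, its fields, and both methods are hypothetical and model only the reported symptom, under the assumption that the count is refreshed solely for a nonzero read result. The second method shows one possible defensive handling for contrast and is not a proposed JDK patch.

```java
import java.io.IOException;
import java.io.Reader;

/** Toy model only; NOT the JDK's XMLEntityScanner. All names are illustrative. */
class ToyEntityBuffer {
    final char[] ch;
    int position; // next char to consume
    int count;    // number of valid chars in ch

    ToyEntityBuffer(int size) {
        ch = new char[size];
    }

    /** Models the reported behaviour: count is left alone when read returns 0. */
    void loadAsReported(Reader r, int offset) throws IOException {
        int n = r.read(ch, offset, ch.length - offset);
        if (n > 0) {
            count = offset + n;
            position = offset;
        } else if (n == -1) {
            count = offset;
            position = offset;
        }
        // n == 0: count keeps its old value, so stale chars beyond
        // offset still look like valid data to the caller.
    }

    /** One possible defensive variant: a zero result means nothing new is valid. */
    void loadDefensively(Reader r, int offset) throws IOException {
        int n = r.read(ch, offset, ch.length - offset);
        count = offset + Math.max(n, 0);
        position = offset;
    }
}
```

In this model, a full four-character buffer whose last character has been copied to index 0 still reports a count of 4 after `loadAsReported` sees a zero result at offset 1, whereas `loadDefensively` reports a count of 1.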

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      I have put some effort into creating a simple reproducible test case for this, but I have found it difficult to do so because it seems sensitive to internal buffering. I hope that my description will enable someone much more familiar with the JAXP code base to develop a simple test case.


      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      Could be any of a number of parse errors if processing bogus input results in invalid XML. For example, my filter eliminates invalid XML characters, since my XML is from a source that produces XML with such characters in character content. In my case, I can receive XML parse errors complaining about these invalid characters, since my filter can leave the bogus characters in the supplied buffer.

      REPRODUCIBILITY :
      This bug can be reproduced rarely.

      CUSTOMER SUBMITTED WORKAROUND :
      I can avoid this bug if I ensure that my filter never returns 0 for a read(cbuf,off,len) operation. Instead, the filter keeps trying to fill the buffer with some valid characters until it either has some or encounters EOF, at which point it returns -1.
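A minimal sketch of the workaround, assuming a filter that strips invalid XML characters. The class name `NonZeroStrippingReader` and the helper `isXmlChar` are hypothetical, and the validity check is simplified (surrogate halves pass through unchecked). The loop re-reads from the underlying reader until at least one character survives filtering or EOF is reached, so a nonzero-length request never yields 0.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical filter that never returns 0 for a nonzero-length request. */
class NonZeroStrippingReader extends FilterReader {
    NonZeroStrippingReader(Reader in) {
        super(in);
    }

    /** Simplified validity check; surrogate halves are passed through unchecked. */
    static boolean isXmlChar(char c) {
        return c == 0x9 || c == 0xA || c == 0xD || (c >= 0x20 && c <= 0xFFFD);
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        if (len == 0) {
            return 0; // nothing was requested
        }
        while (true) {
            int n = in.read(cbuf, off, len);
            if (n == -1) {
                return -1; // EOF before any valid char was found
            }
            int kept = 0;
            for (int i = 0; i < n; i++) {
                char c = cbuf[off + i];
                if (isXmlChar(c)) {
                    cbuf[off + kept++] = c;
                }
            }
            if (kept > 0) {
                return kept;
            }
            // Everything was rejected (or the underlying reader returned 0):
            // read again rather than hand 0 back to the caller. Note this
            // spins if the underlying reader keeps returning 0.
        }
    }
}
```

The trade-off in this sketch is that a single call may block for several underlying reads, which seems acceptable for a blocking Reader but is worth keeping in mind.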

            aefimov Aleksej Efimov
            webbuggrp Webbug Group
            Votes: 0
            Watchers: 3
