JDK-8145894

XMLStreamReaderImpl fails to return valid document encoding

Details

    Description

      FULL PRODUCT VERSION :
      java version "1.8.0_45"
      Java(TM) SE Runtime Environment (build 1.8.0_45-b14)
      Java HotSpot(TM) 64-Bit Server VM (build 25.45-b02, mixed mode)


      ADDITIONAL OS VERSION INFORMATION :
      Linux AMDC1917 3.13.0-52-generic #86-Ubuntu SMP Mon May 4 04:32:59 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

      EXTRA RELEVANT SYSTEM CONFIGURATION :
      Simple Java StAX issue, tested with the latest OpenJDK 7 and the latest Java 8

      A DESCRIPTION OF THE PROBLEM :
      When parsing an XML document with XMLInputFactory (javax.xml.stream.XMLInputFactory) using XMLEventReader (javax.xml.stream.XMLEventReader), the StartDocument event fails to return the valid encoding value specified in the XML document's header.

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Take any XML file with this header:

      <?xml version="1.0" encoding="iso-8859-15" standalone="yes"?>
      <content/>


      Run this minimal code:

      XMLInputFactory factory_in = XMLInputFactory.newInstance();
      FileInputStream in = new FileInputStream("read_in.xml");
      InputStreamReader r = new InputStreamReader(in);
      XMLEventReader eventReader = factory_in.createXMLEventReader(r);

      while (eventReader.hasNext()) {
          XMLEvent event_in = eventReader.nextEvent();
          if (event_in.getEventType() == XMLStreamConstants.START_DOCUMENT) {
              StartDocument aSD = (StartDocument) event_in;
              aSD.getCharacterEncodingScheme(); // THIS RETURNS NULL
          }
      }


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      aSD.getCharacterEncodingScheme() != null && aSD.getCharacterEncodingScheme().compareTo("iso-8859-15")==0
      ACTUAL -
      aSD.getCharacterEncodingScheme() == null

      REPRODUCIBILITY :
      This bug can always be reproduced.
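
The expected-versus-actual comparison above can also be checked without a file on disk. The following is a minimal sketch, not the reporter's code: the class name `EncodingRepro`, the constant `XML`, and the helper `encodingFromStartDocument` are hypothetical names introduced here, and a `StringReader` stands in for the `InputStreamReader` used in the report (both hand the parser a character stream). It only prints what the `StartDocument` event reports; the value may differ between JDK versions.

```java
import java.io.StringReader;

import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.StartDocument;
import javax.xml.stream.events.XMLEvent;

final class EncodingRepro {

    // Same header as in the report, held in memory (hypothetical constant).
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"iso-8859-15\" standalone=\"yes\"?>\n<content/>";

    // Returns whatever the StartDocument event reports; this may be null.
    static String encodingFromStartDocument(String xml) {
        try {
            XMLInputFactory factory = XMLInputFactory.newInstance();
            XMLEventReader reader = factory.createXMLEventReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    XMLEvent event = reader.nextEvent();
                    if (event.isStartDocument()) {
                        return ((StartDocument) event).getCharacterEncodingScheme();
                    }
                }
                return null; // no StartDocument event was seen
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so callers need no checked-exception handling in this sketch.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("StartDocument: encoding = " + encodingFromStartDocument(XML));
    }
}
```

On an affected runtime the printed value would be null, matching the ACTUAL behavior described above.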

      ---------- BEGIN SOURCE ----------
      import java.io.FileInputStream;
      import java.io.IOException;
      import java.io.InputStreamReader;

      import javax.xml.stream.XMLEventFactory;
      import javax.xml.stream.XMLEventReader;
      import javax.xml.stream.XMLEventWriter;
      import javax.xml.stream.XMLInputFactory;
      import javax.xml.stream.XMLOutputFactory;
      import javax.xml.stream.XMLStreamConstants;
      import javax.xml.stream.XMLStreamException;
      import javax.xml.stream.events.StartDocument;
      import javax.xml.stream.events.XMLEvent;

      ...

          public void testEncodingValue() {
              XMLInputFactory factory_in = null;
              XMLOutputFactory factory_out = null;
              XMLEventReader eventReader = null;
              XMLEventFactory eventFactory = null;
              XMLEventWriter eventWriter = null;
              XMLEvent event_out = null;
              XMLEvent event_in = null;

              try {
                  factory_in = XMLInputFactory.newInstance();
                  FileInputStream in = new FileInputStream("resources/stax/readWrite_in.xml");
                  InputStreamReader r = new InputStreamReader(in);
                  eventReader = factory_in.createXMLEventReader (r);
              } catch (XMLStreamException e) {
                  e.printStackTrace();
                  return;
              } catch (Exception e) {
                  //UnsupportedEncodingException or FileNotFoundException
                  e.printStackTrace();
                  return;
              }

              while(eventReader.hasNext()){

                  try {
                      event_in = eventReader.nextEvent();
                  } catch (XMLStreamException e) {
                      e.printStackTrace();
                      break; // stop after a parse error; retrying the same reader could loop forever
                  }

                  if(event_in.getEventType() == XMLStreamConstants.START_DOCUMENT) {
                      StartDocument aSD = (StartDocument) event_in;
                      System.out.println("event.isStartDocument: " + event_in.isStartDocument());
                      System.out.println("event.toString: " + event_in.toString());
                      //*************************************************************************
                      // encoding is not read
                      //*************************************************************************
                      System.out.println("StartDocument: encoding = " + aSD.getCharacterEncodingScheme());
                  }
              }
          }

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      No workaround found. Looking at the OpenJDK code, the issue seems to come from com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl, in the method
      public String getEncoding()

      ==> fEntityScanner.getEncoding() returns null (I do not know exactly what it is supposed to return), but if the returned value came from fScanner.getCharacterEncodingScheme() instead, would it then be correct?
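
The internal fields named above are not reachable from application code, but the public `javax.xml.stream.XMLStreamReader` interface exposes both `getEncoding()` and `getCharacterEncodingScheme()`, so the two values can at least be compared from the outside. The sketch below does that and adds a possible stopgap that is not part of the report: reading the encoding pseudo-attribute straight from the XML declaration with a regular expression. The class `EncodingProbe`, the pattern `XML_DECL`, and both helper methods are hypothetical names; the pattern is a simplification and assumes the declaration is the very first thing in the text.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

final class EncodingProbe {

    // Same header as in the report, held in memory (hypothetical constant).
    static final String XML =
            "<?xml version=\"1.0\" encoding=\"iso-8859-15\" standalone=\"yes\"?>\n<content/>";

    // Simplified XML declaration pattern (assumption): version first, then an optional encoding.
    private static final Pattern XML_DECL = Pattern.compile(
            "^<\\?xml\\s+version\\s*=\\s*(?:\"[^\"]*\"|'[^']*')"
            + "(?:\\s+encoding\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'))?");

    // Returns { getEncoding(), getCharacterEncodingScheme() } as reported by the stream reader.
    static String[] streamReaderEncodings(String xml) {
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            try {
                return new String[] { reader.getEncoding(), reader.getCharacterEncodingScheme() };
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Stopgap: returns the declared encoding, or null if there is no declaration or no encoding in it.
    static String sniffDeclaredEncoding(String xml) {
        Matcher m = XML_DECL.matcher(xml);
        if (!m.lookingAt()) {
            return null;
        }
        return m.group(1) != null ? m.group(1) : m.group(2);
    }

    public static void main(String[] args) {
        String[] reported = streamReaderEncodings(XML);
        System.out.println("getEncoding() = " + reported[0]);
        System.out.println("getCharacterEncodingScheme() = " + reported[1]);
        System.out.println("sniffed from declaration = " + sniffDeclaredEncoding(XML));
    }
}
```

The regular-expression fallback does not depend on the parser at all, which is why it could serve as a stopgap on affected runtimes; it does not validate the document and should not replace the parser's own answer where that answer is available.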


          People

            joehw Joe Wang
            webbuggrp Webbug Group