JDK-4363200

JAXP 1.0.1 doesn't expand general entities (e.g. &lt;)


    • Type: Enhancement
    • Resolution: Fixed
    • Priority: P4
    • Fix Version: 1.1
    • Affects Version: 1.3.0
    • Component: xml
    • Resolved In Build: 1.1fcs
    • CPU: generic
    • OS: generic



      Name: skT45625 Date: 08/16/2000


      java version "1.3.0rc2"
      Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0rc2-Y)
      Java HotSpot(TM) Client VM (build 1.3.0rc2-Y, mixed mode)

      I've just switched to JAXP 1.0.1 after using tr1 and tr2 for the last year
      or more, and I've noticed a problem with the way some entities are expanded.
      I have some elements whose text contains references such as &lt; and &gt;.
      I used to get a single #text child node with the entities expanded (to <
      and >). With JAXP, I get multiple #text nodes, with an EntityReference node
      for each of these entities in between. That is clearly different behavior
      from what I'm used to.
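
The behavior described above can be checked with a small test along these lines. This is only a sketch: the class name, helper names, and sample input are hypothetical and not taken from the report, and it uses the standard `javax.xml.parsers` and `org.w3c.dom` APIs as found in a current JDK rather than the JAXP 1.0.1 build in question.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class PredefinedEntityCheck {
    // Hypothetical sample input; not the reporter's actual data.
    static final String SAMPLE = "<msg>a &lt; b &gt; c</msg>";

    // Parses the given XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Counts the direct children of 'parent' that have the given DOM node type.
    static int countChildren(Node parent, short nodeType) {
        int count = 0;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == nodeType) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        Element root = parseRoot(SAMPLE);
        System.out.println("text nodes: "
            + countChildren(root, Node.TEXT_NODE));
        System.out.println("entity reference nodes: "
            + countChildren(root, Node.ENTITY_REFERENCE_NODE));
        System.out.println("text content: " + root.getTextContent());
    }
}
```

A parser showing the reported problem would print a non-zero count of entity reference nodes for this input; one that expands predefined entities would report none.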

      Supporting documentation:

      Here is something from the DOM2 spec...

      Interface EntityReference

      EntityReference objects may be inserted into the structure model when an
      entity reference is in the source document, or when the user wishes to insert
      an entity reference. Note that character references and references to
      predefined entities are considered to be expanded by the HTML or XML
      processor so that characters are represented by their Unicode equivalent
      rather than by an entity reference. Moreover, the XML processor may
      completely expand references to entities while building the structure model,
      instead of providing EntityReference objects. If it does provide such objects,
      then for a given EntityReference node, it may be that there is no Entity
      node representing the referenced entity. If such an Entity exists, then the
      subtree of the EntityReference node is in general a copy of the Entity node
      subtree. However, this may not be true when an entity contains an unbound
      namespace prefix. In such a case, because the namespace prefix resolution
      depends on where the entity reference is, the descendants of the
      EntityReference node may be bound to different namespace URIs.
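
The choice the passage above leaves to the processor (expand entity references, or keep EntityReference nodes) is exposed in current JAXP through `DocumentBuilderFactory.setExpandEntityReferences`. The sketch below assumes that method, which is not mentioned in this report; the class name and the entity `who` are made up for illustration. What the non-expanding mode puts in the tree can vary between parser versions, so the example only prints what it finds.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class EntityReferenceModes {
    // Hypothetical document declaring one internal general entity.
    static final String SAMPLE =
        "<!DOCTYPE doc [<!ENTITY who \"world\">]><doc>hello &who;</doc>";

    // Parses 'xml' with entity-reference expansion switched on or off.
    static Element parseRoot(String xml, boolean expand) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setExpandEntityReferences(expand);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)))
                          .getDocumentElement();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    // Counts EntityReference nodes among the direct children of 'parent'.
    static int countEntityRefs(Node parent) {
        int count = 0;
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ENTITY_REFERENCE_NODE) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        for (boolean expand : new boolean[] {true, false}) {
            Element root = parseRoot(SAMPLE, expand);
            System.out.println("expand=" + expand
                + " entityRefs=" + countEntityRefs(root));
        }
    }
}
```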


      Here is something from the XML1.0 spec...

      4.6 Predefined Entities

      Entity and character references can both be used to escape the left angle
      bracket, ampersand, and other delimiters. A set of general entities (amp,
      lt, gt, apos, quot) is specified for this purpose. Numeric character
      references may also be used; they are expanded immediately when recognized
      and must be treated as character data, so the numeric character references
      "&#60;" and "&#38;" may be used to escape < and & when they occur in
      character data. All XML processors must recognize these entities whether
      they are declared or not. For interoperability, valid XML documents should
      declare these entities, like any others, before using them. If the entities
      in question are declared, they must be declared as internal entities whose
      replacement text is the single character being escaped or a character reference
      to that character, as shown below.
                 <!ENTITY lt "&#38;#60;">
                 <!ENTITY gt "&#62;">
                 <!ENTITY amp "&#38;#38;">
                 <!ENTITY apos "&#39;">
                 <!ENTITY quot "&#34;">

      Note that the < and & characters in the declarations of "lt" and "amp" are
      doubly escaped to meet the requirement that entity replacement be well-formed.
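
As a rough illustration of the section quoted above, the following sketch parses one string that uses the five predefined entities and another that uses the corresponding numeric character references, then compares the resulting character data. The class and method names are hypothetical, and the code assumes a current JDK's built-in parser.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CharRefEquivalence {
    // Predefined entities: lt, gt, amp, apos, quot.
    static final String WITH_ENTITIES = "<t>&lt;&gt;&amp;&apos;&quot;</t>";
    // The same characters written as numeric character references.
    static final String WITH_CHAR_REFS = "<t>&#60;&#62;&#38;&#39;&#34;</t>";

    // Returns the character data of the root element of 'xml'.
    static String text(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String a = text(WITH_ENTITIES);
        String b = text(WITH_CHAR_REFS);
        System.out.println(a + " equals " + b + ": " + a.equals(b));
    }
}
```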
      (Review ID: 108520)
      ======================================================================

            Assignee: egoei Edwin Goei (Inactive)
            Reporter: skondamasunw Suresh Kondamareddy (Inactive)