JDK-6632959

swing html parser doesn't know &euro; or &rsaquo;

    • Resolved In Build: b97
    • CPU: generic, x86, sparc
    • OS: generic, linux, solaris_10
    • Verification: Verified

      FULL PRODUCT VERSION :
      java version "1.7.0-ea"
      Java(TM) SE Runtime Environment (build 1.7.0-ea-b22)
      Java HotSpot(TM) 64-Bit Server VM (build 11.0-b08, mixed mode)


      ADDITIONAL OS VERSION INFORMATION :
      Linux lithium 2.6.22-14-generic #1 SMP Sun Oct 14 21:45:15 GMT 2007 x86_64 GNU/Linux


      A DESCRIPTION OF THE PROBLEM :
      the HTML of mails from amazon regularly contains &rsaquo;, which Swing's DTD doesn't define. here's a snippet (weird line breaking courtesy of the original; link mangled because i don't know what's encoded in the actual link, and it's irrelevant anyway for the purposes of this bug):

      <tr align='left'><td valign='top'><strong><font color="#cc6600">&rsaquo;</font></strong>&nbsp;</td>
        <td width='100%'

      ><font size='2' face='Verdana, Arial, Helvetica, sans-serif'>
       <a href='http://www.sun.com/'>Cannery Row (Steinbeck "Essentials")</a></font>
        </td>
      </tr

      >

      it would be nice to have all of HTML4's character entity references, even if that's as much HTML4 support as we get in Java 7:

      http://www.w3.org/TR/html4/sgml/entities.html



      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      stick the attached source in "test.java" and then:

      javac test.java && java test

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      something that looks like "> hello". (it's not actually ">". it's '\u203a'. but it looks similar.)
      ACTUAL -
      something that looks like "&rsaquo; hello".
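
      the difference can also be checked without opening a window. the following is a minimal sketch, not part of the original report: it assumes that feeding the same HTML to Swing's ParserDelegator and collecting the text callbacks reflects what the text pane would display. the class name EntityCheck and the helper parseText are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class EntityCheck {
    /** Hypothetical helper: returns the text content Swing's HTML parser reports. */
    public static String parseText(String html) throws IOException {
        final StringBuilder out = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Entities the DTD knows arrive here already decoded.
                out.append(data);
            }
        };
        // true = ignore any charset declared inside the document.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        String text = parseText("<html><head></head><body>&rsaquo; hello</body></html>");
        // A DTD that defines rsaquo yields U+203A; otherwise the raw entity text remains.
        System.out.println(text.indexOf('\u203a') >= 0
                ? "entity resolved"
                : "entity not resolved: " + text);
    }
}
```

      on a runtime with the problem described here, the second branch would be expected to print the text with "&rsaquo;" still in it.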

      REPRODUCIBILITY :
      This bug can be reproduced always.

      ---------- BEGIN SOURCE ----------
      import javax.swing.*;
      public class test extends JFrame {
       private JTextPane textPane;
       public test() {
        setContentPane(textPane = new JTextPane());
        textPane.setContentType("text/html");
        // &rsaquo; is the HTML4 entity for U+203A that the parser fails to recognize
        textPane.setText("<html><head></head><body>&rsaquo; hello</body></html>");
        pack();
        setVisible(true);
       }
       public static void main(String[] args) {
        new test();
       }
      }

      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      code like this, with a line for each HTML4 entity you need:

                  DTD html32 = DTD.getDTD("html32");
                  html32.defEntity("rsaquo", DTDConstants.CDATA | DTDConstants.GENERAL, '\u203a');
                  html32.defEntity("lsaquo", DTDConstants.CDATA | DTDConstants.GENERAL, '\u2039');

      the only trick is that you *must* be sure Swing has already set up the "html32" DTD; it doesn't work if you create the "html32" DTD yourself before Swing does. i'm not sure of the best way to guarantee that.
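
      one possible way to handle the ordering is sketched below. it rests on an assumption that is not confirmed in this report: that constructing a javax.swing.text.html.parser.ParserDelegator makes Swing load and register its default "html32" DTD, so that a later DTD.getDTD("html32") returns that same instance. the class name Html4Entities and the method install are hypothetical.

```java
import java.io.IOException;
import javax.swing.text.html.parser.DTD;
import javax.swing.text.html.parser.DTDConstants;
import javax.swing.text.html.parser.Entity;
import javax.swing.text.html.parser.ParserDelegator;

public class Html4Entities {
    /** Hypothetical helper: adds the two workaround entities to Swing's "html32" DTD. */
    public static DTD install() throws IOException {
        // Assumption: this constructor triggers Swing's own setup of the default DTD,
        // so the lookup below does not hand back a fresh, empty DTD.
        new ParserDelegator();

        DTD html32 = DTD.getDTD("html32");
        int type = DTDConstants.CDATA | DTDConstants.GENERAL;
        // One line per HTML4 entity that is needed.
        html32.defEntity("rsaquo", type, '\u203a');
        html32.defEntity("lsaquo", type, '\u2039');
        return html32;
    }

    public static void main(String[] args) throws IOException {
        Entity rsaquo = install().getEntity("rsaquo");
        System.out.println(rsaquo.getName() + " -> U+"
                + Integer.toHexString(rsaquo.getData()[0]));
    }
}
```

      calling install() once at startup, before any HTML is loaded into a text component, keeps the extra definitions in one place; whether the ordering assumption holds should be checked against the JDK build in use.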

            peterz Peter Zhelezniakov
            ryeung Roger Yeung (Inactive)