- Bug
- Resolution: Fixed
- P3
- 1.4.2, 5.0
- b11
- x86
- windows_2000, windows_xp
Name: js151677 Date: 07/30/2004
FULL PRODUCT VERSION :
java version "1.5.0-beta2"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0-beta2-b51)
Java HotSpot(TM) Client VM (build 1.5.0-beta2-b51, mixed mode, sharing)
A DESCRIPTION OF THE PROBLEM :
Swing's HTML parser, javax.swing.text.html.parser.Parser, incorrectly converts the values of HTML "class" attributes to lower case. This behavior is new in 1.5.0.
The HTML "class" attribute is case-sensitive, so this conversion to lower case is incorrect. The definitive reference is section 7.5.2 of the HTML 4.01 spec:
http://www.w3.org/TR/html401/struct/global.html#adef-class
The "[CS]" marker in the spec's definition of "class" means that the value is case-sensitive.
The code responsible was introduced in 1.5.0 and is in javax/swing/text/html/parser/Parser.java, in the method parseAttributeSpecificationList(Element elem). Near the end of this method there is a fragment that is new in 1.5.0:
if (attkey == HTML.Attribute.CLASS) {
attvalue = attvalue.toLowerCase();
}
So it looks like this was intentional, but it is not clear why.
This is causing us grief in our application, which uses the HTML parser as part of some dynamic HTML generation. Much of the CSS style matching is based on class names, and the matching in the HTML renderer (Internet Explorer or whatever) is case-sensitive, as it should be according to the HTML specs. We have lots of already-developed HTML, HTML fragments, and associated stylesheets, which makes the workaround (using only lower-case HTML class attribute values) impractical.
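To illustrate why lower-casing breaks class-based style matching, here is a minimal sketch. It does not use the Swing parser or a real CSS engine; the class ClassNameMatchSketch and its map-based "stylesheet" are hypothetical stand-ins for a renderer that compares class names case-sensitively.

```java
import java.util.HashMap;
import java.util.Map;

final class ClassNameMatchSketch {
    // Hypothetical stand-in for a stylesheet: class selector name -> declarations.
    static Map<String, String> stylesheet() {
        Map<String, String> rules = new HashMap<>();
        rules.put("bigText", "font-size: 16pt;");
        return rules;
    }

    // Case-sensitive lookup of a class name; returns null when no rule matches.
    static String ruleFor(String classValue) {
        return stylesheet().get(classValue);
    }

    public static void main(String[] args) {
        String original = "bigText";
        // Same transformation that the 1.5.0 fragment applies to the attribute value.
        String lowered = original.toLowerCase();
        System.out.println(original + " -> " + ruleFor(original));
        System.out.println(lowered + " -> " + ruleFor(lowered));
    }
}
```

With the mixed-case name the lookup finds the rule; with the lower-cased name it returns null, which mirrors a ".bigText" rule no longer applying to an element emitted with class="bigtext".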
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
To demonstrate that this behavior has changed:
1. Compile the provided executable test case.
2. Run it under 1.4.2_xx and you'll get the expected (and correct) result.
3. Run it under 1.5.0-beta2 and you'll get a result indicating that the attribute value for "class" has been converted to lower case.
The HTML being parsed by the sample code is:
<html>
<head>
<title>Example HTML containing some class attributes</title>
<style type="text/css">
.bigText { font-size: 16pt; }
</style>
</head>
<body>
<p class="bigText">This text should be big.</p>
</body>
</html>
(note in particular the element with the class="bigText" attribute definition)
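The steps above can also be checked programmatically rather than by reading the printed output. The following is a sketch only: it goes through the public ParserDelegator and HTMLEditorKit.ParserCallback API rather than subclassing Parser as the submitted test case does, and the class name ClassAttributeCheck and the method firstClassValue are made up for illustration. Whether the case is preserved depends on the JDK it is run on.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

final class ClassAttributeCheck {
    // Returns the value of the first "class" attribute found on a start tag,
    // or null if the document has none.
    static String firstClassValue(String html) throws IOException {
        final String[] found = new String[1];
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                Object value = attrs.getAttribute(HTML.Attribute.CLASS);
                if (found[0] == null && value != null) {
                    found[0] = value.toString();
                }
            }
        };
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return found[0];
    }

    public static void main(String[] args) throws IOException {
        String html =
            "<html><body><p class='bigText'>This text should be big.</p></body></html>";
        String value = firstClassValue(html);
        System.out.println("class=" + value);
        // true means the parser kept the original case; false reproduces the report.
        System.out.println("case preserved: " + "bigText".equals(value));
    }
}
```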
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
When you run the above-provided example (class HtmlParserProblem) under 1.4.2_xx, you'll get this:
start tag: html
attributes:
start tag: head
attributes:
start tag: title
attributes:
end tag: title
start tag: style
attributes: type=text/css
end tag: style
end tag: head
start tag: body
attributes: type=text/css
start tag: p
attributes: type=text/css class=bigText
end tag: p
end tag: body
end tag: html
This is correct; note the correct mixed-case value "bigText" for the class attribute.
ACTUAL -
When you run the above-provided example (class HtmlParserProblem) under 1.5.0-beta2, you'll get this:
start tag: html
attributes:
start tag: head
attributes:
start tag: title
attributes:
end tag: title
start tag: style
attributes: type=text/css
end tag: style
end tag: head
start tag: body
attributes: type=text/css
start tag: p
attributes: class=bigtext type=text/css
end tag: p
end tag: body
end tag: html
Note that the class attribute value has been converted to lower case. This is a Bad Thing.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
public class HtmlParserProblem
{
    public static void main(final String[] args) throws java.io.IOException
    {
        new Html32Parser().parse(new java.io.StringReader(getExampleHtml()));
    }

    // Parser subclass that prints every start and end tag it encounters.
    public static class Html32Parser extends javax.swing.text.html.parser.Parser
    {
        public Html32Parser() throws java.io.IOException
        {
            super(loadDtd("html32"));
            this.strict = false;
        }

        public void handleStartTag(final javax.swing.text.html.parser.TagElement tagElement)
        {
            System.out.println("start tag: " + tagElement.getHTMLTag().toString());
            // getAttributes() returns the parser's current attribute set. It is never
            // flushed here, so attributes of earlier tags also show up in the output.
            final javax.swing.text.SimpleAttributeSet attributes = getAttributes();
            System.out.println("attributes: " + attributes.toString());
        }

        public void handleEndTag(final javax.swing.text.html.parser.TagElement tagElement)
        {
            System.out.println("end tag: " + tagElement.getHTMLTag().toString());
        }
    }

    // Loads the binary DTD resource (e.g. html32.bdtd) that ships with the parser package.
    private static javax.swing.text.html.parser.DTD loadDtd(final String dtdName)
        throws java.io.IOException
    {
        final String resourceName = dtdName + ".bdtd";
        final java.io.InputStream inputStream =
            javax.swing.text.html.parser.DTD.class.getResourceAsStream(resourceName);
        if (inputStream == null) throw new java.io.IOException(resourceName);
        final javax.swing.text.html.parser.DTD dtd =
            javax.swing.text.html.parser.DTD.getDTD(dtdName);
        dtd.read(new java.io.DataInputStream(inputStream));
        return dtd;
    }

    // The example document; note the mixed-case class name "bigText".
    private static String getExampleHtml()
    {
        return
            "<html>\n" +
            " <head>\n" +
            " <title>Example HTML containing some class attributes</title>\n" +
            " <style type='text/css'>\n" +
            " .bigText { font-size: 16pt; }\n" +
            " </style>\n" +
            " </head>\n" +
            " <body>\n" +
            " <p class='bigText'>This text should be big.</p>\n" +
            " </body>\n" +
            "</html>";
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
Only use lower-case values for "class" attributes.
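To gauge how much existing markup this workaround would touch, a scan along the following lines could list the class names that are not already lower case. This is a rough sketch and not part of the submitted report: the class MixedCaseClassScan is hypothetical, and the regular expression only handles quoted class="..." or class='...' attributes, so it is not a substitute for a real HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MixedCaseClassScan {
    // Simplified pattern for quoted class attributes (an assumption about the input).
    private static final Pattern CLASS_ATTR =
        Pattern.compile("\\bclass\\s*=\\s*([\"'])(.*?)\\1", Pattern.CASE_INSENSITIVE);

    // Returns every class name in the markup that contains upper-case characters.
    static List<String> mixedCaseClassNames(String html) {
        List<String> result = new ArrayList<>();
        Matcher matcher = CLASS_ATTR.matcher(html);
        while (matcher.find()) {
            // A class attribute may hold several whitespace-separated names.
            for (String name : matcher.group(2).trim().split("\\s+")) {
                if (!name.isEmpty() && !name.equals(name.toLowerCase(Locale.ROOT))) {
                    result.add(name);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String html = "<p class='bigText small'>x</p><div class=\"note\"></div>";
        System.out.println(mixedCaseClassNames(html));
    }
}
```

Every name this reports would need renaming in both the HTML and the associated stylesheets for the workaround to hold.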
Release Regression From : 1.4.2_04
The above release is the last release in which this behavior was
known to work correctly. Since then there has been a regression.
(Incident Review ID: 290362)
======================================================================
###@###.### 10/18/04 23:22 GMT
- duplicates
  JDK-5082482 The Swing HTMLEditorKit should be case-insensitive with CSS class names (Closed)
- relates to
  JDK-4674744 1.4.0 REGRESSION: HTML CLASS tag ignored using HTMLEditorKit (Resolved)