- Bug
- Resolution: Unresolved
- P4
- None
- 1.4.2
- x86
- windows_xp
FULL PRODUCT VERSION :
java version "1.4.2_06"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_06-b03)
Java HotSpot(TM) Client VM (build 1.4.2_06-b03, mixed mode)
Also visible on 1.3.1_09.
ADDITIONAL OS VERSION INFORMATION :
Microsoft Windows XP [Version 5.1.2600]
Also visible on other platforms.
A DESCRIPTION OF THE PROBLEM :
When parsing HTML with the ParserCallback and ParserDelegator classes, if the head of the document contains multiple meta tags written as self-closing tags (ending in "/>"), only the first one is parsed correctly and produces a call to the ParserCallback method handleSimpleTag. Subsequent meta tags written in this form are not reported. Example (this is how most pages specify meta tags):
<html><head>
<meta name="description" content="one two three." />
<meta name="keywords" content="a b c" />
</head></html>
Only one meta tag is visible and triggers a callback to ParserCallback.handleSimpleTag.
However, if the meta tags are written without the trailing slash, for example
<meta name="blah" content ="blah" >
then every meta tag is reported.
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Create a ParserCallback that overrides handleSimpleTag and prints each meta tag it receives, enumerating over the tag's attributes. Parse the example HTML shown above.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
A handleSimpleTag callback for every meta tag in the document.
ACTUAL -
Only the first meta tag produces a callback.
ERROR MESSAGES/STACK TRACES THAT OCCUR :
None. The parser does not crash; the later meta tags are simply not reported.
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
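import java.util.Enumeration;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit.ParserCallback;
import javax.swing.text.html.parser.ParserDelegator;

// "reader" is a java.io.Reader over the example HTML; "debug" is any logging helper.
ParserCallback callback =
    new ParserCallback()
    {
        public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos)
        {
            if (t == HTML.Tag.META)
            {
                debug("META:");
                // Print the attribute names of each meta tag the parser reports.
                Enumeration e = a.getAttributeNames();
                while (e.hasMoreElements())
                {
                    debug("\t" + e.nextElement());
                }
            }
        }
    };
ParserDelegator pd = new ParserDelegator();
pd.parse(reader, callback, true);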
---------- END SOURCE ----------
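The fragment above leaves out the reader and the debug helper. A self-contained variant might look like the sketch below. The class name MetaTagRepro and the method countMetaCallbacks are made up for illustration and are not part of the JDK. It counts META callbacks for the self-closing form and for the plain form so the two can be compared. No output is shown, since the count for the self-closing form may depend on the JDK version.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit.ParserCallback;
import javax.swing.text.html.parser.ParserDelegator;

class MetaTagRepro {
    /** Counts handleSimpleTag callbacks for META tags (hypothetical helper). */
    static int countMetaCallbacks(String html) {
        final int[] count = {0};
        ParserCallback callback = new ParserCallback() {
            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.META) {
                    count[0]++;
                }
            }
        };
        try {
            // Third argument is ignoreCharSet, as in the fragment above.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            // A StringReader should not fail; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        // The example from the report: both meta tags end in "/>".
        String selfClosed = "<html><head>"
            + "<meta name=\"description\" content=\"one two three.\" />"
            + "<meta name=\"keywords\" content=\"a b c\" />"
            + "</head></html>";
        // The same document with the trailing slashes removed.
        String plain = "<html><head>"
            + "<meta name=\"description\" content=\"one two three.\">"
            + "<meta name=\"keywords\" content=\"a b c\">"
            + "</head></html>";
        System.out.println("self-closed META callbacks: " + countMetaCallbacks(selfClosed));
        System.out.println("plain META callbacks: " + countMetaCallbacks(plain));
    }
}
```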
CUSTOMER SUBMITTED WORKAROUND :
None, other than writing my own parser, since I need all meta tags.
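Since the description notes that meta tags without the trailing slash are reported, one possible stopgap is to strip the "/>" ending before handing the text to the parser. The sketch below is only an illustration of that idea. MetaTagWorkaround, stripSelfClosingSlash and metaNames are made-up names, not JDK APIs, and the regular expression is deliberately crude: it also rewrites any "/>" that appears inside attribute values, scripts or text.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit.ParserCallback;
import javax.swing.text.html.parser.ParserDelegator;

class MetaTagWorkaround {
    // Optional whitespace followed by the "/>" that ends an XHTML-style empty tag.
    private static final Pattern SELF_CLOSING = Pattern.compile("\\s*/>");

    /** Rewrites "<meta ... />" to "<meta ...>" (hypothetical helper, crude on purpose). */
    static String stripSelfClosingSlash(String html) {
        return SELF_CLOSING.matcher(html).replaceAll(">");
    }

    /** Parses the rewritten HTML and collects the "name" attribute of each META tag reported. */
    static List<String> metaNames(String html) {
        final List<String> names = new ArrayList<String>();
        ParserCallback callback = new ParserCallback() {
            @Override
            public void handleSimpleTag(HTML.Tag t, MutableAttributeSet a, int pos) {
                if (t == HTML.Tag.META) {
                    names.add(String.valueOf(a.getAttribute(HTML.Attribute.NAME)));
                }
            }
        };
        try {
            new ParserDelegator().parse(
                new StringReader(stripSelfClosingSlash(html)), callback, true);
        } catch (IOException e) {
            // A StringReader should not fail; rethrow unchecked to keep callers simple.
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<meta name=\"description\" content=\"one two three.\" />"
            + "<meta name=\"keywords\" content=\"a b c\" />"
            + "</head></html>";
        System.out.println(metaNames(html));
    }
}
```

This only sidesteps the problem for input that can be rewritten before parsing; it does not change how the parser itself treats the "/>" form.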