- Bug
- Resolution: Unresolved
- P3
- None
- 6
- x86
- linux
FULL PRODUCT VERSION :
1.6.0_02-b05
ADDITIONAL OS VERSION INFORMATION :
[root@s118 ~]# cat /etc/issue
Red Hat Enterprise Linux ES release 4 (Nahant Update 4) Kernel \r on an \m
[root@s118 ~]# uname -a
Linux s118.osi-tech.com 2.6.9-42.EL #1 Wed Jul 12 23:16:43 EDT 2006 i686 i686 i386 GNU/Linux
Microsoft Windows 2000 [Version 5.00.2195]
A DESCRIPTION OF THE PROBLEM :
If the URL contains '&Nu', it is converted to the Greek capital letter Nu (which looks like 'N') even though no ';' is appended to it. This problem exists only in jdk1.6 onwards; it works fine with jdk1.5 and earlier versions.
Expected behaviour: when the URL string is parsed with the javax.swing.text.html parser (HTMLEditorKit), '&Nu' should not be treated as a character entity unless a semicolon (;) is appended to it.
It worked fine with jdk1.5 and earlier versions, but the same code shows this issue with jdk1.6. Did the implementation that parses such URL strings change in jdk1.6?
STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
Step 1 : Paste the Java source below into a file called "TestUrl.java" and compile it.
Step 2 : Copy the HTML content below, which contains a URL with "&Nu", into the file C:\Url.html.
Step 3 : Run the program. The output looks like this:
: link ::::: http://www.sun.com?=srinivas
Expected:
: link ::::: http://www.sun.com&Nu=srinivas
Here "&Nu" stands for "No of Units", not the Greek letter 'N'. The parser implicitly treats "&Nu" as the Greek character and converts it to that letter even though no ';' follows "&Nu".
Note: this works fine with jdk1.5. The problem exists in jdk1.6.
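The steps above depend on a file at C:\Url.html. As a minimal sketch, the same check can be run against an in-memory string. The class name InlineLinkRepro and the method extractLinks are hypothetical and not part of the original report. The sketch calls ParserDelegator directly, on the assumption that it is the default parser behind HTMLEditorKit. The printed link depends on the JDK in use, so no expected output is shown.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical class name; not part of the original report.
class InlineLinkRepro {

    // Parses the given HTML and returns the HREF value of every <a> tag.
    static List<String> extractLinks(String html) {
        final List<String> links = new ArrayList<String>();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };
        try {
            // The last argument (true) tells the parser to ignore charset directives.
            new ParserDelegator().parse(new StringReader(html), callback, true);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return links;
    }

    public static void main(String[] args) {
        // Same anchor as in Url.html, with "&Nu" and no trailing ';'.
        String html = "<HTML><BODY>"
                + "<a href=\"http://www.sun.com&Nu=srinivas\">Sun Micro Systems</a>"
                + "</BODY></HTML>";
        for (String link : extractLinks(html)) {
            System.out.println(": link ::::: " + link);
        }
    }
}
```

Running this under jdk1.5 and jdk1.6 should show whether "&Nu" survives in the extracted link without involving the file system.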
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
: link ::::: http://www.sun.com&Nu=srinivas
ACTUAL -
: link ::::: http://www.sun.com?=srinivas
ERROR MESSAGES/STACK TRACES THAT OCCUR :
No Error Message
REPRODUCIBILITY :
This bug can be reproduced always.
---------- BEGIN SOURCE ----------
HTML CONTENT : Url.html
=======================
<HTML>
<BODY>
<a href="http://www.sun.com&Nu=srinivas">Sun Micro Systems </BODY> </HTML>
Source Code for TestUrl.java file
==================================
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLConnection;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.BadLocationException;
import javax.swing.text.EditorKit;
import javax.swing.text.SimpleAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLDocument;
import javax.swing.text.html.HTMLEditorKit;

/*
 * Created on Jul 1, 2007
 *
 * TODO To change the template for this generated file go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
/**
 * @author snalla
 *
 * TODO To change the template for this generated type comment go to
 * Window - Preferences - Java - Code Style - Code Templates
 */
public class TestUrl {
    static File file = null;

    public static void main(String[] args) {
        try {
            // Make sure that Url.html file exists in C:\ to read the content from this file
            String urlfile = "C://Url.html";
            file = new File(urlfile);
            URI uri = file.toURI();
            String arr[] = getLinks(uri.toURL().toString());
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static String[] getLinks(String uriStr) {
        List result = new ArrayList();
        try {
            // Create a reader on the HTML content
            URL url = new URI(uriStr).toURL();
            URLConnection conn = url.openConnection();
            Reader rd = new InputStreamReader(conn.getInputStream());
            // Parse the HTML
            EditorKit kit = new HTMLEditorKit();
            HTMLDocument doc = (HTMLDocument) kit.createDefaultDocument();
            doc.putProperty("IgnoreCharsetDirective", Boolean.TRUE);
            kit.read(rd, doc, 0);
            // Find all the A elements in the HTML document
            HTMLDocument.Iterator it = doc.getIterator(HTML.Tag.A);
            while (it.isValid()) {
                SimpleAttributeSet s = (SimpleAttributeSet) it.getAttributes();
                String link = (String) s.getAttribute(HTML.Attribute.HREF);
                if (link != null) {
                    // Add the link to the result list
                    System.out.println(": link ::::: " + link);
                    result.add(link);
                }
                it.next();
            }
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (URISyntaxException e) {
            e.printStackTrace();
        } catch (BadLocationException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        // Return all found links
        return (String[]) result.toArray(new String[result.size()]);
    }
}
---------- END SOURCE ----------
Release Regression From : 5.0u12
The above release value was the last known release where this
bug was not reproducible. Since then there has been a regression.
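Until the regression is addressed, one possible workaround (an assumption, not something the report proposes or confirms) is to escape bare ampersands as "&amp;" before handing the HTML to the parser, so that "&Nu" can no longer be read as an entity reference. The class AmpersandEscaper, the method escapeBareAmpersands, and the regular expression below are all hypothetical names and choices for illustration. The sketch leaves semicolon-terminated references such as "&amp;" or "&#169;" untouched.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the name and the regex are assumptions, not from the report.
class AmpersandEscaper {

    // Matches '&' that does NOT start a semicolon-terminated reference
    // such as "&amp;", "&#169;" or "&#xA9;".
    private static final Pattern BARE_AMPERSAND =
            Pattern.compile("&(?!#?[A-Za-z0-9]+;)");

    // Rewrites every bare '&' as "&amp;" and leaves terminated references alone.
    static String escapeBareAmpersands(String text) {
        return BARE_AMPERSAND.matcher(text).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // prints http://www.sun.com&amp;Nu=srinivas
        System.out.println(escapeBareAmpersands("http://www.sun.com&Nu=srinivas"));
    }
}
```

A parser that decodes "&amp;" back to a literal '&' should then return the href as "http://www.sun.com&Nu=srinivas". This has not been verified against the affected jdk1.6 build.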