- Bug
- Resolution: Fixed
- P3
- 1.1.4
- 1.2beta4
- x86
- windows_nt
- Not verified
==========================================================================
carlos.lucasius@canada 1998-03-13:
Bug reported by Corel (licensee) for JFC1.1 using JDK1.1.4 on WINNT4.0.
Corel: "Very critical bug for our product - requires immediate attention."
Spaces between two differently formatted pieces of text are discarded
when loading. That is, the following HTML code:
<b>bold</b> <u>underline</u>
will be loaded like this:
boldunderline
==========================================================================
carlos.lucasius@canada 1998-04-08:
Further to a conference call including Rick Levenson, Hans Muller, myself,
and Corel on April 6, 1998, the same source (Corel) has requested this bug
to have the following information/fix added to its description:
I agree that the reasoning used by the parser when loading the HTML
documents is correct.
i.e., multiple whitespace characters are replaced by a single space, and
whitespace between tags is stripped.
The problem, however, lies in the way the HTMLWriter currently
outputs whitespace and the way tabs are inserted into the document.
To my knowledge, the tab character is not recognized in any way by
the HTML spec. I think the first step is to change the way tabs are
handled by the HTMLEditorKit. I think (correct me if I'm wrong) the
InsertTabAction as defined in DefaultEditorKit should be overridden/replaced
in the HTMLEditorKit. The replacement should insert a fixed
number of spaces rather than a tab character. With this solution, it would
also be nice to have get/set methods for the number of spaces to use when
inserting the tab. This would allow us to change the value according to
user preferences.
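The tab handling suggested above was not included in the report. A minimal sketch of what such an action might look like follows, written against the current Swing API rather than JFC 1.1. The class name SpacesForTabAction, the default of 4 spaces, and the tabReplacement() helper are assumptions made for illustration and do not come from the report.

```java
import java.awt.event.ActionEvent;
import javax.swing.text.DefaultEditorKit;
import javax.swing.text.JTextComponent;
import javax.swing.text.TextAction;

/** Hypothetical replacement for the default insert-tab action. */
class SpacesForTabAction extends TextAction {
    // Assumed default; the report does not name a value.
    private int spacesPerTab = 4;

    SpacesForTabAction() {
        // Reuse the standard action name so this action can stand in
        // for the one defined by DefaultEditorKit.
        super(DefaultEditorKit.insertTabAction);
    }

    int getSpacesPerTab() {
        return spacesPerTab;
    }

    void setSpacesPerTab(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("spacesPerTab must be >= 1");
        }
        spacesPerTab = n;
    }

    /** Builds the run of spaces inserted for one Tab key press. */
    String tabReplacement() {
        StringBuilder sb = new StringBuilder(spacesPerTab);
        for (int i = 0; i < spacesPerTab; i++) {
            sb.append(' ');
        }
        return sb.toString();
    }

    @Override
    public void actionPerformed(ActionEvent e) {
        JTextComponent target = getTextComponent(e);
        // Insert spaces at the caret (or over the selection) instead of '\t'.
        if (target != null && target.isEditable() && target.isEnabled()) {
            target.replaceSelection(tabReplacement());
        }
    }
}
```

One plausible way to install it would be to override getActions() in an HTMLEditorKit subclass and merge the action in with TextAction.augmentList, where an action in the second list replaces a same-named action in the first. That wiring is an assumption and has not been checked against the release discussed here.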
The second part to the fix involves the HTMLWriter and how it deals
with whitespace. Included is a fix which I think handles things
correctly. I replaced the 'writeText' method with a more detailed version.
It also parses the output string one character at a time but takes special
actions with spaces. When it encounters a space, it counts how many
consecutive spaces there are. It then does the following:
If there are 2 or more spaces, then they are all output as &nbsp;.
If there is 1 space and it is located next to a tag start/end, then
&nbsp; is output.
If there is 1 space and it is located between similar content, then it
is output as a ' ' char.
I hope you find this solution appropriate. (I didn't include the
overridden InsertTabAction, since it should be simple.) I replaced:
    private void writeText(Writer w, String data) {
        int len = data.length();
        for (int i = 0; i < len; i++) {
            writeChar(w, data.charAt(i));
        }
    }
with:
    /**
     * Writes the string out to the given writer.
     * Scans the string and replaces blanks with character entities
     * when appropriate.
     */
    private void writeText(Writer w, String data)
    {
        // cycle through all the characters
        int nLength = data.length();
        for (int index = 0 ; index < nLength ; index++)
        {
            int nSpaces = 0;
            char nextChar = data.charAt(index);
            try
            {
                // if we encounter a space (or a non-breaking space, char 160),
                // count how many consecutive ones there are
                if (nextChar == ' ' || (int)nextChar == 160)
                {
                    do
                    {
                        nSpaces++;
                        if (index + nSpaces < nLength)
                        {
                            nextChar = data.charAt(index + nSpaces);
                        }
                    }
                    while ((index + nSpaces < nLength) &&
                           (nextChar == ' ' || (int)nextChar == 160));
                    // if only one space with data before and after it, write out a
                    // normal space
                    if (nSpaces == 1 && index > 0 && (index + nSpaces) < nLength)
                    {
                        w.write(' ');
                    }
                    // otherwise convert all spaces to character entities
                    else
                    {
                        for (int i = nSpaces ; i > 0 ; i--)
                        {
                            w.write("&nbsp;");
                        }
                    }
                    // continue checking each character but skip over the blanks
                    // we just parsed
                    index += nSpaces - 1;
                }
                // just write out the character
                else
                {
                    writeChar(w, nextChar);
                }
            }
            catch (IOException e)
            {
                System.out.println("HTMLWriter.write: " + nextChar + " " + e);
            }
        }
    }
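
To make the effect of the rule easier to check outside of HTMLWriter, the same decision logic can be sketched as a standalone function that returns a String. The class NbspEscaper and its method names are hypothetical, and unlike the patch this sketch copies non-blank characters through unchanged instead of calling writeChar, so it performs no other escaping.

```java
/** Hypothetical standalone version of the blank-handling rule in the patch. */
class NbspEscaper {
    // Treat both the plain space and the non-breaking space (char 160) as blanks.
    private static boolean isBlank(char c) {
        return c == ' ' || c == '\u00A0';
    }

    static String escapeSpaces(String data) {
        StringBuilder out = new StringBuilder();
        int n = data.length();
        int i = 0;
        while (i < n) {
            char c = data.charAt(i);
            if (!isBlank(c)) {
                out.append(c); // no further escaping in this sketch
                i++;
                continue;
            }
            // Measure the run of consecutive blanks starting at i.
            int start = i;
            while (i < n && isBlank(data.charAt(i))) {
                i++;
            }
            int run = i - start;
            if (run == 1 && start > 0 && i < n) {
                // A single blank with text on both sides stays a normal space.
                out.append(' ');
            } else {
                // Runs of two or more, and blanks at either end of the text
                // (where a tag boundary would be), become entities.
                for (int k = 0; k < run; k++) {
                    out.append("&nbsp;");
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escapeSpaces("a b"));
        System.out.println(escapeSpaces("a  b"));
        System.out.println(escapeSpaces(" underline"));
    }
}
```

Under this rule a text run that begins or ends with a single space, such as " underline", is written with a leading &nbsp;, which should keep the parser from stripping it as whitespace between tags when the document is read back.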
- relates to
- JDK-4102894 The HTMLEditorKit's read() method doesn't handle the <BR> tag
- Closed