- Bug
- Resolution: Fixed
- P4
- None
- b26
Issue | Fix Version | Assignee | Priority | Status | Resolution | Resolved In Build |
---|---|---|---|---|---|---|
JDK-8011822 | 8 | Ioi Lam | P4 | Closed | Fixed | b85 |
I found a bug in UTF8::as_quoted_ascii() that would cause random crashes on Win32 (see "HERE" below in the code):
void Symbol::print_symbol_on(outputStream* st) const {
  st = st ? st : tty;
  st->print("%s", as_quoted_ascii());
}

char* Symbol::as_quoted_ascii() const {
  const char *ptr = (const char *)&_body[0];
  int quoted_length = UTF8::quoted_ascii_length(ptr, utf8_length());
  char* result = NEW_RESOURCE_ARRAY(char, quoted_length + 1);
  UTF8::as_quoted_ascii(ptr, result, quoted_length + 1);
  return result;
}

// converts a utf8 string to quoted ascii
void UTF8::as_quoted_ascii(const char* utf8_str, char* buf, int buflen) {
  const char *ptr = utf8_str;
  char* p = buf;
  char* end = buf + buflen;
  while (*ptr != '\0') {  // <<<<<<<<<<<<<<<<<<<<<<<< HERE
    jchar c;
    ptr = UTF8::next(ptr, &c);
    if (c >= 32 && c < 127) {
      if (p + 1 >= end) break;      // string is truncated
      *p++ = (char)c;
    } else {
      if (p + 6 >= end) break;      // string is truncated
      sprintf(p, "\\u%04x", c);
      p += 6;
    }
  }
  *p = '\0';
}
The (*ptr != '\0') check in UTF8::as_quoted_ascii assumes that it's OK to read one byte past the end of the utf8_str. This byte may not be zero, but the following (p+1 >= end) check would ensure the loop terminates.
However, after I fixed the Symbol::size() function (see JDK-8009575), I have a case where:
Symbol x = 0x00003ff0{
_length = 8;
_body[] = "activate" // &_body[0] = 0x00003ff8
};
x.size() == 16 bytes
The first byte after the end of "activate" is at 0x00004000, i.e., at a different page. The symbol is allocated by malloc. On Windows, the page immediately following the newly allocated space could be unmapped. So the (*ptr != '\0') line would crash with a faulting address of 0x00004000. I saw this in a JPRT crash dump.
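The address arithmetic in this case can be checked with a small sketch. The 4 KB page size and the helper names (kPageSize, page_of, terminator_read_crosses_page) are assumptions made for illustration; they are not HotSpot code.

```cpp
#include <cassert>
#include <cstdint>

// Assumed page size of 4 KB (0x1000), matching the addresses in the report.
static const std::uintptr_t kPageSize = 0x1000;

// Index of the page that contains 'addr'.
static std::uintptr_t page_of(std::uintptr_t addr) {
  return addr / kPageSize;
}

// Hypothetical helper: true if the byte just past [body, body + length)
// lies on a different page than the last byte of the body.
static bool terminator_read_crosses_page(std::uintptr_t body,
                                         std::uintptr_t length) {
  assert(length > 0);
  return page_of(body + length) != page_of(body + length - 1);
}
```

With &_body[0] = 0x00003ff8 and _length = 8, the last body byte is at 0x00003fff and the byte read by the loop condition is at 0x00004000, so the helper reports a page crossing.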
The fix would be to add a utf8_length parameter to UTF8::as_quoted_ascii -- similar to the existing function UNICODE::as_quoted_ascii(const jchar* base, int length, char* buf, int buflen).
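A minimal sketch of what a length-bounded variant could look like follows. It is not the committed fix: the names as_quoted_ascii_bounded, next_bounded and jchar_t are hypothetical, and the decoder is a simplified stand-in for UTF8::next() that handles only 1- to 3-byte sequences and maps a truncated or invalid sequence to '?'.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

typedef unsigned short jchar_t;  // stand-in for HotSpot's jchar (assumption)

// Hypothetical decoder: reads one character (1 to 3 bytes) starting at 'ptr'
// and never touches memory at or beyond 'limit'. Requires ptr < limit.
static const char* next_bounded(const char* ptr, const char* limit,
                                jchar_t* out) {
  const unsigned char* p = (const unsigned char*)ptr;
  std::size_t avail = (std::size_t)(limit - ptr);
  unsigned char b0 = p[0];
  if (b0 < 0x80) {                           // single-byte character
    *out = b0;
    return ptr + 1;
  }
  if ((b0 & 0xE0) == 0xC0 && avail >= 2) {   // two-byte sequence
    *out = (jchar_t)(((b0 & 0x1F) << 6) | (p[1] & 0x3F));
    return ptr + 2;
  }
  if ((b0 & 0xF0) == 0xE0 && avail >= 3) {   // three-byte sequence
    *out = (jchar_t)(((b0 & 0x0F) << 12) | ((p[1] & 0x3F) << 6) |
                     (p[2] & 0x3F));
    return ptr + 3;
  }
  *out = '?';                                // truncated or invalid sequence
  return ptr + 1;
}

// Converts utf8_str[0 .. utf8_length) to quoted ASCII. The loop is bounded by
// the explicit length, so no byte past the end of the input is ever read.
static void as_quoted_ascii_bounded(const char* utf8_str, int utf8_length,
                                    char* buf, int buflen) {
  assert(buflen > 0);
  const char* ptr = utf8_str;
  const char* utf8_end = utf8_str + utf8_length;
  char* p = buf;
  char* end = buf + buflen;
  while (ptr < utf8_end) {
    jchar_t c;
    ptr = next_bounded(ptr, utf8_end, &c);
    if (c >= 32 && c < 127) {
      if (p + 1 >= end) break;               // string is truncated
      *p++ = (char)c;
    } else {
      if (p + 6 >= end) break;               // string is truncated
      std::snprintf(p, 7, "\\u%04x", (unsigned)c);
      p += 6;
    }
  }
  *p = '\0';
}
```

Under this sketch, Symbol::as_quoted_ascii() would pass utf8_length() as the extra argument, which it already computes for UTF8::quoted_ascii_length().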
This bug could also crash the existing code (without the fix for JDK-8009575) if the first byte past the end of the string is a UTF-8 lead byte, which would cause UTF8::next() to read additional bytes. We have probably never seen such a crash because (I am guessing):
(1) Only Windows would have unmapped space immediately following a malloc'ed block.
(2) Windows fills the malloc'ed block with zeros, so UTF8::next() reads only 1 byte.
- backported by
  - JDK-8011822 Possible reading from unmapped memory in UTF8::as_quoted_ascii() (Closed)
- duplicates
  - JDK-8010097 JVM crash in VirtualMachine.attach() (Closed)
- relates to
  - JDK-8010097 JVM crash in VirtualMachine.attach() (Closed)
  - JDK-8009575 Reduce Symbol::_refcount from 4 bytes to 2 bytes (Closed)