A DESCRIPTION OF THE REQUEST :
We are using the 64-bit JVM with very large heaps - typically around 50 GBytes. During object serialization we have recently started encountering a class of error we had never seen before:
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
This appears to be caused by the following:
ObjectOutputStream creates an instance of Object[] to hold data in its internal HandleTable class, which is a simple hash table. The initial size is 10, but when the array's capacity is exhausted a new array of size 2n+1 is created, where n is the old array capacity.
Due to this algorithm, the array size grows like this:
10
21
43
87
175
351
703
1407
2815
5631
11263
22527
45055
90111
180223
360447
720895
1441791
2883583
5767167
11534335
23068671
46137343
92274687
184549375
369098751
The maximum permitted number of elements for an instance of Object[] is 268,435,456. So when the serialization code tries to grow the capacity to 369,098,751 it crashes with an OutOfMemoryError, which is the VM's default response to an attempt to create an array with more elements than it permits.
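The growth sequence above, and the point at which it exceeds the limit, can be reproduced with a small stand-alone sketch. The 268,435,456 element ceiling is the figure quoted above and is treated here as an assumption, not a value queried from the VM:

public class HandleTableGrowth {
    public static void main(String[] args) {
        final long maxElements = 268435456L;  // assumed Object[] element limit (from the description above)
        long capacity = 10;                   // initial HandleTable capacity
        while (capacity <= maxElements) {
            System.out.println(capacity);
            capacity = capacity * 2 + 1;      // growth rule: new size = 2n + 1
        }
        // First size that exceeds the limit - this is where the OutOfMemoryError is thrown.
        System.out.println(capacity + "  <-- exceeds " + maxElements);
    }
}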
More generally, the whole collections framework is affected by this limit in what feels like a rather arbitrary manner.
For example, the default capacity of an ArrayList is 10, and the growth algorithm simply doubles the capacity when the backing array is full. This means a default ArrayList hits the OutOfMemoryError once it holds 167,772,160 elements and must grow again, whereas if the initial capacity is set to 15 the failure point moves to 251,658,240 elements.
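A similar sketch shows how the initial capacity moves the failure point under the doubling behaviour described above (again taking the 268,435,456 element ceiling as an assumption):

public class DoublingCrashPoint {
    // Largest capacity reachable from 'initial' by repeated doubling
    // without the next doubling exceeding the assumed per-array limit.
    static long lastSafeCapacity(long initial, long maxElements) {
        long capacity = initial;
        while (capacity * 2 <= maxElements) {
            capacity *= 2;
        }
        return capacity;
    }

    public static void main(String[] args) {
        final long maxElements = 268435456L;  // assumed Object[] element limit
        System.out.println("initial 10 -> fails beyond " + lastSafeCapacity(10, maxElements));  // 167772160
        System.out.println("initial 15 -> fails beyond " + lastSafeCapacity(15, maxElements));  // 251658240
    }
}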
All of this seems very arbitrary and is not well documented anywhere to the best of my knowledge.
JUSTIFICATION :
We'd like to be able to serialize arbitrarily large object graphs.
I'm submitting this as an RFE rather than a bug because it all appears to be caused by a design choice in the implementation of HotSpot: no object can be larger than 2**31 bytes, even in a 64-bit JVM.
EXPECTED VERSUS ACTUAL BEHAVIOR :
EXPECTED -
I'd like serialization of a Vector or other collection to complete without an OutOfMemoryError caused by a HotSpot restriction interacting with an internal algorithmic detail of the serialization implementation.
ACTUAL -
Serialization crashes with OutOfMemoryError.
---------- BEGIN SOURCE ----------
Try compiling this bit of code. Then run it on a 64-bit JVM with lots of heap.
java -Xmx10g -Xms10g Serial 185000000
import java.io.OutputStream;
import java.io.ObjectOutputStream;
import java.util.Vector;

// An OutputStream that discards everything written to it, so the test
// exercises only the serialization machinery and not any real I/O.
class MyOutputStream extends OutputStream {
    public MyOutputStream() {}
    public void close() {}
    public void flush() {}
    public void write(byte[] b) {}
    public void write(byte[] b, int off, int len) {}
    public void write(int b) {}
}

public class Serial {
    public static void main(String[] args) throws Exception {
        int length = Integer.parseInt(args[0]);
        MyOutputStream mos = new MyOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(mos);

        // Build a Vector containing 'length' distinct Integer objects.
        Vector vector = new Vector(15);
        System.out.println("CREATING VECTOR");
        for (int i = 0; i < length; i++) {
            if (i % 10000000 == 0)
                System.out.println(i);
            vector.add(new Integer(i));
        }
        System.out.println("CREATED VECTOR");

        // Serializing the vector fills ObjectOutputStream's internal handle
        // table with one entry per unique object, which eventually forces the
        // failing array growth described above.
        long startTime = System.currentTimeMillis();
        System.out.println("Beginning serialization");
        oos.writeObject(vector);
        long endTime = System.currentTimeMillis();
        System.out.println("Elapsed time = " + (endTime - startTime) + " msecs");
    }
}
---------- END SOURCE ----------
CUSTOMER SUBMITTED WORKAROUND :
None. We can't serialize the object graph. We would need a mechanism for dividing the object graph into n sections, each containing fewer than roughly 184,500,000 unique objects, and I'm not quite sure how to do that...
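One possible shape for such a mechanism, sketched here as an untested idea rather than a verified workaround, is to write the data in fixed-size sections and call ObjectOutputStream.reset() between them; reset() discards the stream's handle table, so no single section drives it past the failing growth step. The chunk size is an arbitrary illustrative parameter, and the approach only helps if the graph can actually be partitioned - it cannot simply be applied to one monolithic Vector passed to a single writeObject call.

import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.List;

public class ChunkedWriter {
    // Writes the elements of 'data' in sections of at most 'chunkSize' objects,
    // resetting the stream between sections so the handle table stays small.
    static void writeInChunks(ObjectOutputStream oos, List data, int chunkSize)
            throws IOException {
        for (int start = 0; start < data.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, data.size());
            oos.writeObject(data.subList(start, end).toArray());
            oos.reset();   // clears the handle table before the next section
        }
    }
}

Note that reset() also breaks object-identity sharing across sections, so the reading side would have to reassemble the sections itself.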
- duplicates
JDK-5089202 cannot allocate a java.lang.StringBuffer object with size of 1GB (Closed)
- relates to
JDK-6464834 ObjectOutputStream's internal array management limits maximum size (Open)
JDK-4880587 64-Bit indexing for arrays (Closed)