Based on suggestions from [~eosterlund] during the review of JDK-8310160
[1] Allocate an OopStorage with the same number of slots as the number of individual CDS archived objects.
[2] Allocate a fake "scratch" object for every CDS archived object. These can be byte arrays or java.lang.Object instances.
[3] Copy each archived object into its scratch space
[4] Fix up the pointers inside the scratch spaces
Theoretically, GC can happen during [2], and the scratch spaces may end up in arbitrary locations, which would make [4] slow. However, in most cases, the scratch spaces would form one or a small number of contiguous blocks. Within each block, the order of the objects is the same as their order in the CDS archive.
- If we have a single contiguous, ordered block, the pointer fixing can be done with a fast path (same performance as the current implementation)
- If we have a small number of contiguous, ordered blocks, we can speculate that the object being fixed and the object it points to are in the same block. If the pointer points into a different block, it can be fixed with a quick lookup. We had an implementation that was optimized for up to 4 such blocks. See
https://github.com/openjdk/jdk/blob/65442a2e26afa7c31b5949e7e20606e4066ced3b/src/hotspot/share/cds/archiveHeapLoader.cpp#L270-L324
- Otherwise, the pointer patching needs to use a hashtable lookup. This should be very rare.
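The speculative lookup for a small number of blocks could look roughly like the sketch below. It is not the code from archiveHeapLoader.cpp; ScratchBlock, relocate_pointer and the convention of returning 0 for an unknown address are hypothetical names and choices made only for illustration. With a single block the speculation always succeeds, which corresponds to the fast path. The hashtable fallback for many scattered blocks is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one contiguous, ordered run of scratch objects.
struct ScratchBlock {
  uintptr_t archived_start;  // address of the run's first byte in the archive
  uintptr_t archived_end;    // one past the run's last byte in the archive
  uintptr_t runtime_start;   // where the run actually landed in the Java heap

  bool contains(uintptr_t archived_addr) const {
    return archived_start <= archived_addr && archived_addr < archived_end;
  }
  uintptr_t relocate(uintptr_t archived_addr) const {
    return archived_addr - archived_start + runtime_start;
  }
};

// Translate a pointer stored in an archived object into its runtime value.
// `home` is the index of the block that holds the object being fixed.
inline uintptr_t relocate_pointer(const std::vector<ScratchBlock>& blocks,
                                  size_t home, uintptr_t archived_addr) {
  assert(home < blocks.size());
  // Speculate that the target lives in the same block as the referrer.
  if (blocks[home].contains(archived_addr)) {
    return blocks[home].relocate(archived_addr);
  }
  // Misprediction: with only a few blocks, a linear scan is a quick lookup.
  for (const ScratchBlock& b : blocks) {
    if (b.contains(archived_addr)) {
      return b.relocate(archived_addr);
    }
  }
  return 0;  // not an archived address (treated as null in this sketch)
}
```

For example, with two blocks whose archived ranges are [0x1000, 0x2000) and [0x2000, 0x3000), a pointer from an object in the first block to archived address 0x2008 misses the speculation and is resolved by the scan against the second block.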
======================================
The advantage of this proposal is that we don't need to have CDS-specific code in the GCs anymore.
The disadvantage is that performance may be slower. This could become more problematic in the future for Project Leyden, as we expect more objects to be archived. (The current size is about 1MB, so perhaps it's not a big deal.)
We might be able to achieve the same performance with some small tuning of the scratch object allocation:
- CDS tells the GC that it's allocating scratch objects, with the requested base address and total size
- Some GCs may be able to honor the request so that the allocated scratch objects are exactly where CDS wants them to be. In this case, the archive heap can be mmapped with no relocation
- Or, the GC could ensure that the scratch objects are in a single contiguous, ordered block. This allows optimized relocation.
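As a rough sketch of how CDS might tell these outcomes apart after allocation, the code below classifies the result from the actual scratch addresses. Placement, ObjectPlacement and classify are hypothetical names, not an existing HotSpot or GC interface. The sketch assumes the objects are listed in archive order and packed with no padding between them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical outcomes of the scratch object allocation.
enum class Placement {
  Exact,       // objects sit at the requested address: no relocation needed
  Contiguous,  // one ordered block elsewhere: a single delta relocates all
  Scattered    // several blocks: per-block or hashtable lookup needed
};

// Hypothetical record, one per archived object, in archive order.
struct ObjectPlacement {
  uintptr_t actual_addr;  // where the GC put the scratch object
  size_t size;            // object size in bytes
};

inline Placement classify(uintptr_t requested_base,
                          const std::vector<ObjectPlacement>& objs) {
  assert(!objs.empty());
  uintptr_t expected = objs[0].actual_addr;
  for (const ObjectPlacement& o : objs) {
    // Each object must start exactly where the previous one ended.
    if (o.actual_addr != expected) {
      return Placement::Scattered;
    }
    expected += o.size;
  }
  return objs[0].actual_addr == requested_base ? Placement::Exact
                                               : Placement::Contiguous;
}
```

In the Exact case the archive could be mapped as is. In the Contiguous case every pointer is adjusted by the same delta between the actual and requested base addresses.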
- duplicates
  - JDK-8326035 Ahead-of-Time GC Agnostic Object Archiving (Submitted)
- relates to
  - JDK-8310160 Make GC APIs for handling archive heap objects agnostic of GC policy (Closed)
  - JDK-8311604 Simplify NOCOOPS requested addresses for archived heap objects (Resolved)
  - JDK-8296263 Uniform APIs for mapping archived heap regions (Closed)
  - JDK-8251330 Reorder CDS archived heap to speed up relocation (Resolved)
  - JDK-8313224 Avoid calling JavaThread::current() in MemAllocator::Allocation constructor (Resolved)