Enhancement
Resolution: Unresolved
P4
repo-leyden
The AOT cache contains an increasing number and variety of "assets": metadata (klass, profile), Java object graphs (module graph, integer box cache), and AOT code blobs (adapters, pre-init methods, peak methods).
As we pour more and more assets into the cache, we will certainly strain some size limits, and we will need to give users some control over the size budget of the cache, as well as over the kind and quantity of assets to insert into it.
At a minimum, we need a way to dump out a fine-grained accounting of what is in an AOT cache, to assist in rough budgeting of cache size.
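As a rough illustration of what such an accounting could look like, the sketch below totals asset sizes per category. It does not reflect any existing JVM API or flag: the `Asset` record, the `Kind` enum, and the sample entries and sizes are all hypothetical, chosen only to mirror the asset kinds named above.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class AotCacheAccounting {
    // Hypothetical categories, mirroring the asset kinds listed in this issue.
    enum Kind { METADATA, OBJECT_GRAPH, CODE_BLOB }

    // Hypothetical description of one cached asset; not a real JVM data structure.
    record Asset(String name, Kind kind, long sizeBytes) {}

    // Sum the size of all assets in each category.
    static Map<Kind, Long> bytesByKind(List<Asset> assets) {
        Map<Kind, Long> totals = new EnumMap<>(Kind.class);
        for (Asset a : assets) {
            totals.merge(a.kind(), a.sizeBytes(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Asset> assets = List.of(
            new Asset("klass java/lang/String", Kind.METADATA, 4_096),
            new Asset("profile for String.hashCode", Kind.METADATA, 512),
            new Asset("integer box cache", Kind.OBJECT_GRAPH, 8_192),
            new Asset("adapter blob", Kind.CODE_BLOB, 1_024));

        // Print one line per category with its total size.
        bytesByKind(assets).forEach((kind, bytes) ->
            System.out.printf("%-12s %8d bytes%n", kind, bytes));
    }
}
```

A real report would presumably be finer-grained (per class, per method, per object graph root), but even a per-category total like this is enough for rough budgeting.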
Second, we should build a way to audit whether a given asset in the cache was actually helpful. Specifically, it should be possible, after a production run has had a chance to use the AOT cache, to print an accounting of which assets were used (when/why) and which assets have not been used.
When assets go unused, something is probably wrong, especially if there are many such assets (or they are by some measure "big" in the AOT cache). A report of such assets would provide useful feedback to the engineering cycle, and might even be useful to an automatic policy: the policy would note events during a (future) training run that match unused assets in a (past) production run, and would refrain from taking those events into account when choosing assets for the AOT cache.
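The audit and the automatic policy described above can be sketched as two small functions. This is only a sketch under stated assumptions: the `Asset` record, the idea that usage is recorded as a set of asset names, and the matching of training-run events to assets by name are all hypothetical simplifications, not an existing mechanism.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class UnusedAssetAudit {
    // Hypothetical description of one cached asset; not a real JVM data structure.
    record Asset(String name, long sizeBytes) {}

    // Report assets that the production run never used, largest first,
    // so that the "big" offenders appear at the top of the report.
    static List<Asset> unused(List<Asset> cached, Set<String> usedNames) {
        return cached.stream()
            .filter(a -> !usedNames.contains(a.name()))
            .sorted(Comparator.comparingLong(Asset::sizeBytes).reversed())
            .toList();
    }

    // Hypothetical policy: when choosing assets in a later training run,
    // skip candidates that match assets reported unused by a past production run.
    static List<String> filterCandidates(List<String> trainingCandidates,
                                         List<Asset> unusedBefore) {
        Set<String> skip = unusedBefore.stream()
            .map(Asset::name)
            .collect(Collectors.toSet());
        return trainingCandidates.stream()
            .filter(c -> !skip.contains(c))
            .toList();
    }

    public static void main(String[] args) {
        // Made-up sample data, for illustration only.
        List<Asset> cached = List.of(
            new Asset("com/example/TrainingHarness", 20_000),
            new Asset("com/example/OrderService", 12_000),
            new Asset("com/example/WarmupData", 64_000));
        Set<String> usedInProduction = Set.of("com/example/OrderService");

        List<Asset> report = unused(cached, usedInProduction);
        report.forEach(a ->
            System.out.printf("UNUSED %-30s %8d bytes%n", a.name(), a.sizeBytes()));

        List<String> nextCandidates = filterCandidates(
            List.of("com/example/OrderService", "com/example/WarmupData"), report);
        System.out.println("Candidates for next cache: " + nextCandidates);
    }
}
```

Sorting by size is one possible way to surface the assets that matter most to the size budget; a real report would likely also record when and why each used asset was used.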
This assessment will be especially important if and when we give end users choices about actions taken during training runs or the assembly phase. If the user writes special logic to run at such times, we will want some way to make sure that logic has not overly perturbed the AOT cache. A report of unused assets might show assets for user-written logic (classes, code, objects) which applies only to the training run, and which the user knows is not helpful to production runs. After seeing a report of such assets, the user can then react appropriately, perhaps by manually excluding certain classes. See also this email exchange:
https://mail.openjdk.org/pipermail/leyden-dev/2024-October/001122.html
The foundation for all of this is, first, visibility into what is in the AOT cache and, second, a distinction between which assets are used and which are not.