Make G1 mixed collections abortable if they might exceed the pause target.
Make all pauses in G1 abortable.
One of the goals of G1 is to meet a user supplied pause time target for its collection pauses. G1 uses an advanced analysis engine to select the amount of work to be done during a collection (this is partly based on application behavior). The result of this selection is a set of regions called the collection set. Once the collection set has been determined and the collection has been started then G1 must collect all live objects in all regions of the collection set without stopping. This behavior can lead to G1 exceeding the pause time goal if the heuristics choose a too-large collection set, which for example can happen if the application’s behavior changes such that the heuristics work on "stale" data. This can in particular be observed during mixed collections, where the collection set can often contain too many old regions. There is need for a mechanism that detects when the heuristics repeatedly select a wrong amount of work for collections, and if so, have G1 perform the collection work incrementally in steps, where the collection can be aborted after each step. Such a mechanism would allow G1 to meet the pause time goal more often.
If G1 discovers that the collection set selection heuristics repeatedly select the wrong number of regions, switch to a more incremental way of doing mixed collections: split the collection set into two parts, a mandatory and an optional part. The mandatory part comprises parts of the collection set that G1 cannot process incrementally (e.g., the young regions) but can also contain old regions for improved efficiency. This may, e.g., be 80% of the predicted collection set. The remaining 20% of the predicted collection set, which would consist of only old regions, then forms the optional part.
After G1 finishes collecting the mandatory part, G1 starts collecting the optional part at a much more granular level, if there is time left. The granularity of collection of this optional part depends on the amount of time left, at most down to one region at a time. After completing collection of any part of the optional collection set, G1 can decide to stop the collection depending on the remaining time.
As the predictions get more accurate again, the optional part of a collection are made smaller and smaller, until the mandatory part once again comprises all of the collection set (i.e., G1 completely relies on its heuristics). If the predictions becomes inaccurate again, then the next collections will consist of both a mandatory and optional part again.
Improve the analytics engine and heuristics so that they don't make bad predictions. Although an interesting goal in itself, given that the heuristics depend on previous application behavior, it is impossible to get 100% accurate heuristics. However, improved heuristics will automatically decrease the need for this mechanism.
Always use a "safety margin" with regards to the prediction. For example, if the prediction returns x, always use 0.8 * x (a 20% safety margin). This would probably work in most cases, but will result in sub-optimal performance when the predictions do work as this means that G1 would only utilize 80% of the pause target.
Reuse the existing evacuation failure mechanism to abort mixed collections. This has been rejected as an alternative because at the time of such an abort there is no guarantee that the collection freed any region. The mechanism we propose guarantees space reclamation progress by reclaiming the space of the collection set on a region basis, which is G1's granularity of space reclamation.
The individual C++ parts making up the implementation should be tested using C++ unit tests. Since the abortable mixed collections code would be an integral part of the G1 GC, the code will be exercised by just running the existing tests.
Risks and Assumptions
When splitting the collection set into mandatory and optional parts, some additional data needs to be maintained for the optional collection set part. This imposes a slight CPU overhead, smaller than 1% and only for mixed collections using the optional collection set part.
The native memory usage during mixed collections using an optional collection set part may also increase. This is because some additional incoming pointers to the regions in the optional part must be tracked during that mixed collection.