JDK-8360935

G1: Time Based Heap Uncommit During Idle Periods

    • Monica Beckwith
    • Feature
    • Open
    • gc
    • Implementation
    • hotspot dash gc dash dev at openjdk dot java dot net
    • M
    • S

      Summary

      Introduce time-based heap sizing for the G1 garbage collector to automatically uncommit unused memory regions during application idle periods, independent of garbage collection frequency.

      Goals

      • Enable G1 to uncommit memory from inactive regions without requiring garbage collections
      • Provide configurable timing parameters for memory uncommit decisions
      • Maintain existing G1 performance characteristics and pause time goals
      • Improve memory efficiency for applications with infrequent garbage collection cycles and extended idle periods

      Non-Goals

      • This JEP does not implement time-based heap expansion
      • This JEP does not modify existing G1 expansion heuristics or pause-time based sizing
      • This JEP does not change G1's collection scheduling or concurrent marking behavior

      Success Metrics

      G1 should automatically release unused Java heap memory during application idle periods without requiring garbage collection activity. Success is measured by:

      • Committed memory reduction of 15-45% during idle periods (validated through SPECjbb2015 testing)
      • No measurable impact on application throughput (variation below 2%, which is within measurement noise)
      • Zero increase in GC pause times during evaluation periods
      • Successful operation across all supported platforms and container environments
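
      The first metric can be observed from inside the JVM through the standard platform MXBean. The following is a minimal sketch, not the measurement method used for the SPECjbb2015 runs (the source does not describe it); the class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

/** Hypothetical helper for observing committed heap; not part of the proposal. */
final class CommittedHeapProbe {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    /** Currently committed Java heap in bytes, as reported by the platform MXBean. */
    static long committedBytes() {
        return MEMORY.getHeapMemoryUsage().getCommitted();
    }

    /** Percentage reduction from a busy-phase sample to an idle-phase sample. */
    static double reductionPercent(long busyBytes, long idleBytes) {
        if (busyBytes <= 0) {
            throw new IllegalArgumentException("busyBytes must be positive");
        }
        return 100.0 * (busyBytes - idleBytes) / busyBytes;
    }
}
```

      One would sample committedBytes() at the end of a load phase and again after the idle period, then compare the two values with reductionPercent().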

      Motivation

      Current G1 heap sizing depends on garbage collection activity. Applications with infrequent garbage collections may retain committed memory unnecessarily during idle periods, leading to poor memory efficiency in containerized environments and multi-tenant systems. This enhancement addresses the limitation described in JDK-8357445 by providing regular heap size re-evaluation independent of GC cycles.

      Modern cloud-native applications often have variable workload patterns with significant idle periods. During those periods G1 does not return committed heap memory to the operating system unless a garbage collection occurs. This creates unnecessary memory pressure in containerized deployments and reduces overall system efficiency.

      Description

      This enhancement introduces a periodic background task that evaluates heap regions for uncommit based on inactivity time rather than GC-driven metrics.

      Key Components

      Region Activity Tracking: Each heap region maintains a timestamp that is updated when the region is cleared for reuse and when allocation regions are retired.

      Evaluation Task: A configurable periodic task (G1HeapEvaluationTask) evaluates regions for uncommit eligibility.

      Uncommit Policy: Regions are considered for uncommit if they are empty and have been inactive longer than a configurable delay period.
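
      The tracking and policy components above can be modeled in a few lines. This is an illustrative Java model only: the actual change lives in HotSpot's C++ sources, all names below are hypothetical, and treating the delay boundary as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of a heap region; field and method names are hypothetical. */
final class RegionModel {
    private boolean empty;
    private long lastActiveMillis;

    RegionModel(boolean empty, long nowMillis) {
        this.empty = empty;
        this.lastActiveMillis = nowMillis;
    }

    /** Called at a lifecycle event, e.g. clearing for reuse or retiring an allocation region. */
    void recordActivity(long nowMillis, boolean emptyAfterEvent) {
        this.empty = emptyAfterEvent;
        this.lastActiveMillis = nowMillis;
    }

    /** Empty and inactive for at least the delay (inclusive boundary is an assumption). */
    boolean isUncommitCandidate(long nowMillis, long uncommitDelayMillis) {
        return empty && (nowMillis - lastActiveMillis) >= uncommitDelayMillis;
    }
}

/** Hypothetical evaluation step: collects the regions that satisfy the policy. */
final class UncommitPolicyModel {
    static List<RegionModel> candidates(List<RegionModel> regions,
                                        long nowMillis,
                                        long uncommitDelayMillis) {
        List<RegionModel> result = new ArrayList<>();
        for (RegionModel region : regions) {
            if (region.isUncommitCandidate(nowMillis, uncommitDelayMillis)) {
                result.add(region);
            }
        }
        return result;
    }
}
```

      Passing the clock value in as a parameter keeps the model deterministic, which makes the policy easy to exercise without real timers.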

      Configuration

      Four new command-line flags control the behavior:

      -XX:+G1UseTimeBasedHeapSizing (default: false, EXPERIMENTAL)
        Enables time-based heap sizing. Requires -XX:+UnlockExperimentalVMOptions.
      
      -XX:G1TimeBasedEvaluationIntervalMillis=<value> (default: 60000, MANAGEABLE, range: 1000-3600000)
        Interval in milliseconds between heap evaluations. 
      
      -XX:G1UncommitDelayMillis=<value> (default: 300000, MANAGEABLE, range: 1000-7200000)
        Minimum time in milliseconds a region must be inactive before uncommit eligibility.
      
      -XX:G1MinRegionsToUncommit=<value> (default: 10, EXPERIMENTAL, range: 1-1000)
        Minimum number of eligible regions required before uncommit occurs.
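
      As an illustration of how these flags combine on a command line, the sketch below builds the argument list for launching a child JVM and checks each value against the ranges listed above. The helper class is hypothetical, and the flags exist only in builds that include this change.

```java
import java.util.List;

/** Hypothetical helper that assembles the proposed flags for a child JVM. */
final class TimeBasedSizingFlags {

    static List<String> jvmArgs(long evaluationIntervalMillis,
                                long uncommitDelayMillis,
                                int minRegionsToUncommit) {
        // Ranges are taken from the flag descriptions in this document.
        checkRange("G1TimeBasedEvaluationIntervalMillis", evaluationIntervalMillis, 1_000, 3_600_000);
        checkRange("G1UncommitDelayMillis", uncommitDelayMillis, 1_000, 7_200_000);
        checkRange("G1MinRegionsToUncommit", minRegionsToUncommit, 1, 1_000);
        return List.of(
            "-XX:+UseG1GC",
            // Placed before the experimental flags it unlocks.
            "-XX:+UnlockExperimentalVMOptions",
            "-XX:+G1UseTimeBasedHeapSizing",
            "-XX:G1TimeBasedEvaluationIntervalMillis=" + evaluationIntervalMillis,
            "-XX:G1UncommitDelayMillis=" + uncommitDelayMillis,
            "-XX:G1MinRegionsToUncommit=" + minRegionsToUncommit);
    }

    private static void checkRange(String flag, long value, long min, long max) {
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                flag + "=" + value + " is outside the range " + min + "-" + max);
        }
    }
}
```

      For example, jvmArgs(60_000, 300_000, 10) yields the documented defaults spelled out explicitly; the list can be prepended to the arguments of a ProcessBuilder that starts java.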

      Implementation

      The evaluation task operates independently of GC cycles using existing G1 service task infrastructure. Region activity is tracked with minimal overhead at existing lifecycle transition points.

      Log Output

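      When enabled, the feature logs at initialization and during each evaluation. The sample below was captured with non-default settings (30-second evaluation interval, 60-second uncommit delay, 2-region minimum), as the initialization line shows: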

      Initialization:

      [2025-06-27T19:16:13.618+0000][info][gc,init] G1 Time-Based Heap Sizing enabled (uncommit-only): evaluation_interval=30000ms, uncommit_delay=60000ms, min_regions_to_uncommit=2

      Evaluation Activity:

      [2025-06-27T19:16:13.656+0000][debug][gc,sizing] Starting heap evaluation
      [2025-06-27T19:16:13.656+0000][debug][gc,sizing] Full region scan: found 0 inactive regions out of 1736 total regions
      
      [2025-06-27T19:17:43.659+0000][debug][gc,sizing] Uncommit candidates found: 63 inactive regions out of 1736 total regions
      [2025-06-27T19:17:43.659+0000][info ][gc,sizing] Time-based uncommit: found 63 inactive regions, uncommitting 10 regions (80MB)
      [2025-06-27T19:17:43.659+0000][info ][gc,sizing] Time-based evaluation: shrinking heap by 80MB
      [2025-06-27T19:17:43.662+0000][info ][gc,heap] Heap shrink completed: uncommitted 10 regions (80MB), heap size now 776MB
      
      [2025-06-27T19:19:13.669+0000][info ][gc,sizing] Time-based uncommit: found 271 inactive regions, uncommitting 61 regions (488MB)
      [2025-06-27T19:19:13.669+0000][info ][gc,heap] Heap shrink completed: uncommitted 61 regions (488MB), heap size now 4408MB
      
      [2025-06-27T19:22:13.686+0000][debug][gc,sizing] Starting heap evaluation
      [2025-06-27T19:22:13.686+0000][debug][gc,sizing] Full region scan: found 0 inactive regions out of 1736 total regions
      [2025-06-27T19:22:13.687+0000][info ][gc,sizing] Time-based evaluation: no heap uncommit needed (evaluation #10)
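
      For monitoring, the "Heap shrink completed" lines can be extracted with a regular expression. The sketch below assumes the message format stays exactly as in the sample above, which is not guaranteed for an experimental feature; the class names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for the "Heap shrink completed" lines shown in the sample log. */
final class ShrinkLogParser {

    /** Values carried by one shrink line. */
    static final class ShrinkEvent {
        final int regions;
        final long uncommittedMb;
        final long heapSizeMb;

        ShrinkEvent(int regions, long uncommittedMb, long heapSizeMb) {
            this.regions = regions;
            this.uncommittedMb = uncommittedMb;
            this.heapSizeMb = heapSizeMb;
        }
    }

    private static final Pattern SHRINK = Pattern.compile(
        "Heap shrink completed: uncommitted (\\d+) regions \\((\\d+)MB\\), heap size now (\\d+)MB");

    /** Returns the parsed event, or empty if the line is not a shrink-completed line. */
    static Optional<ShrinkEvent> parse(String line) {
        Matcher m = SHRINK.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new ShrinkEvent(
            Integer.parseInt(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))));
    }
}
```

      Dividing uncommittedMb by regions gives the region size in use; both shrink lines in the sample work out to 8MB per region.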

      Alternatives

      1. GC-Triggered Evaluation

        • Approach: Evaluate for uncommit during GC cycles
        • Rejected: Would impact GC pause times and couple sizing with collection frequency
      2. Allocation-Pressure Based

        • Approach: Trigger uncommit based on allocation rates
        • Rejected: Doesn't address idle period memory efficiency
      3. External Memory Pressure API

        • Approach: React to system memory pressure signals
        • Rejected: Platform-specific and less predictable for application tuning
      4. Statistics-Based Thresholds

        • Approach: Use heap utilization statistics over time windows
        • Rejected: More complex and less deterministic than time-based approach
      5. Immediate Uncommit

        • Approach: Uncommit regions immediately when they become empty
        • Rejected: Caused excessive memory thrashing in allocation-heavy workloads

      Testing

      JTReg Test Suite

      Core Functionality Tests:

      • TestG1RegionUncommit: Basic uncommit functionality and edge cases
      • TestTimeBasedHeapSizing: Overall feature behavior and integration
      • TestTimeBasedRegionTracking: Region activity tracking and lifecycle
      • TestTimeBasedHeapConfig: Parameter validation and configuration

      Stress Testing:

      • Long-running applications (24+ hour runs)
      • Concurrent allocation and uncommit scenarios
      • Extreme parameter configurations
      • Container resource limit boundary conditions

      Performance Benchmarking

      SPECjbb2015 Comprehensive Testing:

      • 120 benchmark runs across different parameter combinations
      • Parameter matrix testing: evaluation intervals (1s-300s), uncommit delays (1s-300s), region thresholds (1-50)
      • Key Results: 15-45% memory reduction during idle periods, <2% throughput variation, no GC pause time impact

      Platform Validation

      Multi-Platform Testing:

      • Linux (x86_64, aarch64): Ubuntu 22.04, Azure D4pdsv5 instances
      • Windows (x86_64): Windows 11
      • macOS (aarch64): macOS Sequoia 15.6
      • Container environments: Docker with Hyper-V backend

      Risks and Assumptions

      Performance Risks

      • Minimal Runtime Overhead: Activity tracking only occurs during region lifecycle events
        • Validated: SPECjbb2015 showed <0.1% allocation overhead across 120 test runs
      • Background Task Impact: Evaluation task runs at low priority with configurable intervals
        • Validated: No measurable impact on application threads across all platform tests

      Functional Risks

      • Memory Thrashing: Conservative uncommit policies and minimum thresholds prevent excessive commit/uncommit cycles
        • Mitigated: Default 5-minute delay and 10-region minimum threshold prevent thrashing
      • Race Conditions: Comprehensive synchronization using existing G1 heap locks
        • Validated: Extensive multi-platform, multi-architecture testing has exposed no race conditions

      Assumptions

      • Applications will have identifiable idle periods where memory can be safely reclaimed
        • Validated: SPECjbb2015 and long-running application tests confirm assumption
      • Time-based policies provide sufficient flexibility for diverse application patterns
        • Validated: Parameter matrix testing across 120 configurations shows adaptability

      Dependencies

      • G1 Garbage Collector implementation
      • HotSpot service task scheduling infrastructure
      • Platform-specific memory management APIs

      Acknowledgements

      This enhancement builds upon the existing G1 garbage collector infrastructure and benefits from the robust foundation provided by years of G1 development and optimization. The time-based approach complements existing sizing mechanisms while maintaining G1's performance characteristics.
