It may be possible to tighten the NMT lock scope in os::release_memory and os::uncommit_memory by swapping the order of pd_release_memory and record_virtual_memory_release (or pd_uncommit_memory and record_virtual_memory_uncommit).
The new sequence for release could look like this:
Thread A:
lock
update NMT
unlock
pd_release
Thread B:
pd_reserve
lock
update NMT
unlock
As long as Thread A has not yet released the reservation, Thread B cannot race to re-reserve the same range. So Thread A can take its time to first update NMT under the lock, and then perform the release without the protection of the lock.
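The ordering above can be illustrated with a small toy model. This is only a sketch: ToyNmt, ToyOs, toy_release_memory and toy_reserve_memory are hypothetical names, not HotSpot code, and the "OS" is a map keyed by a fixed base address rather than a real mapping call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

// Toy model only: none of these types or functions are HotSpot APIs.
struct ToyNmt {
  std::mutex lock;                                 // stands in for the NMT lock
  std::map<std::uintptr_t, std::size_t> reserved;  // base -> size, as NMT sees it
};

struct ToyOs {
  std::mutex lock;                                 // models the OS serializing mappings
  std::map<std::uintptr_t, std::size_t> mapped;    // base -> size, as the OS sees it

  // Models pd_reserve at a fixed address: fails while the base is still mapped.
  bool reserve(std::uintptr_t base, std::size_t size) {
    std::lock_guard<std::mutex> g(lock);
    return mapped.emplace(base, size).second;
  }

  // Models pd_release.
  bool release(std::uintptr_t base) {
    std::lock_guard<std::mutex> g(lock);
    return mapped.erase(base) == 1;
  }
};

// Thread A: update NMT under the lock first, then release outside the lock.
bool toy_release_memory(ToyNmt& nmt, ToyOs& os, std::uintptr_t base) {
  {
    std::lock_guard<std::mutex> g(nmt.lock);
    nmt.reserved.erase(base);
  }
  return os.release(base);  // no NMT lock held here
}

// Thread B: reserve first, then record in NMT under the lock.
bool toy_reserve_memory(ToyNmt& nmt, ToyOs& os, std::uintptr_t base, std::size_t size) {
  assert(size > 0);
  if (!os.reserve(base, size)) {
    return false;  // the range is still held by the previous owner
  }
  std::lock_guard<std::mutex> g(nmt.lock);
  nmt.reserved[base] = size;
  return true;
}
```

In this model a second reserve of the same base fails until the release has happened, so by the time Thread B records its reservation, Thread A's NMT update is already done.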
With regard to uncommit, the OS would not prevent the race, but external synchronization should keep another thread from re-committing the region before the uncommit has completed.
Thread A:
lock
update NMT
unlock
pd_uncommit
<Synchronizing operation to signal that the region is no longer committed>
Thread B:
<Synchronizing operation to learn that the region is available to be committed>
pd_commit
lock
update NMT
unlock
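One possible shape for the synchronizing operation is an atomic availability flag with release/acquire ordering. The sketch below assumes that form purely for illustration; the issue does not say which mechanism callers use, and ToyRegion, toy_uncommit_and_publish and toy_try_commit are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>

// Toy model only: not HotSpot code.
struct ToyRegion {
  std::mutex nmt_lock;                // stands in for the NMT lock
  std::size_t nmt_committed = 0;      // committed bytes as NMT sees them
  bool os_committed = false;          // committed state as the OS sees it
  std::atomic<bool> available{true};  // hypothetical external handoff flag
};

// Thread A: NMT update under the lock, uncommit outside it, then publish.
void toy_uncommit_and_publish(ToyRegion& r, std::size_t bytes) {
  {
    std::lock_guard<std::mutex> g(r.nmt_lock);
    assert(r.nmt_committed >= bytes);
    r.nmt_committed -= bytes;
  }
  r.os_committed = false;                              // models pd_uncommit
  r.available.store(true, std::memory_order_release);  // signal: region is free
}

// Thread B: commit only after learning that the region is available.
bool toy_try_commit(ToyRegion& r, std::size_t bytes) {
  bool expected = true;
  if (!r.available.compare_exchange_strong(expected, false,
                                           std::memory_order_acq_rel)) {
    return false;  // the previous owner has not finished uncommitting
  }
  r.os_committed = true;  // models pd_commit
  std::lock_guard<std::mutex> g(r.nmt_lock);
  r.nmt_committed += bytes;
  return true;
}
```

Because the flag is only set after the uncommit, a committer that observes it also observes the earlier NMT update, which is the property the sequence above relies on.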
This enhancement depends on being able to fail fatally if pd_release or pd_uncommit fails (see JDK-8353564). If there are scenarios in which we cannot fail fatally and we must instead recover, then we'd have to readjust NMT accounting for the changes made eagerly before the pd_release/pd_uncommit failure.
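To make the recovery cost concrete, here is a minimal sketch of both policies. All names (ToyCommitAccounting, OnFailure, toy_uncommit_with_policy) are hypothetical, and the pd_uncommit call is modeled as a callable returning bool rather than the real platform function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <mutex>

// Toy model only: not HotSpot code.
struct ToyCommitAccounting {
  std::mutex lock;                  // stands in for the NMT lock
  std::size_t committed_bytes = 0;  // committed bytes as NMT sees them
};

enum class OnFailure { Fatal, Recover };

template <typename PdUncommit>
bool toy_uncommit_with_policy(ToyCommitAccounting& nmt, std::size_t bytes,
                              PdUncommit pd_uncommit, OnFailure policy) {
  {
    // Eager accounting update, done before the OS call.
    std::lock_guard<std::mutex> g(nmt.lock);
    assert(nmt.committed_bytes >= bytes);
    nmt.committed_bytes -= bytes;
  }
  if (pd_uncommit()) {
    return true;
  }
  if (policy == OnFailure::Fatal) {
    std::abort();  // fail fatally: no rollback needed
  }
  {
    // Recovery path: undo the eager update.
    std::lock_guard<std::mutex> g(nmt.lock);
    nmt.committed_bytes += bytes;
  }
  return false;
}
```

Note that in the recovery path NMT briefly under-reports committed memory between the eager update and the rollback, which is part of why failing fatally is the simpler precondition.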
See original discussion: https://github.com/openjdk/jdk/pull/24084#issuecomment-2752240832