Enhancement
Resolution: Fixed
P4
None
b06
In our current GHA workflows, we only run workflows on branches in personal forks. GHA isolation rules say that workflow caches from parent branches can be used by descendant branches. For our branches, the usual parent is "master". Since we do not run workflows on "master", every new branch starts with logically empty caches; only the next trigger on the same branch would use the caches saved from the first workflow run.
This means we put additional load on shared infrastructure by pulling JDKs, building jtreg (and pulling its dependencies), bootstrapping sysroots, etc. All these steps also fail intermittently every so often. It also means everyone carries lots of caches around, segregated by branch (look into your https://github.com/<id>/jdk/actions/caches), relying only on cache cleanups when usage starts to hit 10 GB. With 200+ contributors, this is easily 2 TB of cloud space we effectively waste in GHA clouds.
--- Post integration status: ---
This improvement introduces the notion of a "dry run", which does everything except the actual builds and tests. It therefore verifies that all dependencies are set up properly for the JDK configure step to pass. This is useful in itself for future GHA debugging of dependencies. The workflow can now be dispatched with an additional "dry run" parameter.
What makes master-branch caching possible is the second part of the change, which hooks up dry runs to master/stabilization branch pushes. The dry-run workflow now runs every time you update your personal fork's master/stabilization branch. That dry run would likely finish very quickly if all caches are already in place; if not, it would populate the caches in the master/stabilization branch of your personal fork.
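The trigger setup described above could be sketched roughly like this (a hand-written illustration only, not the actual openjdk/jdk workflow definition; the input name, branch patterns, and description are assumptions):

```yaml
# Illustrative sketch: dry runs on master/stabilization pushes,
# plus a manual dispatch knob. Names here are assumptions.
on:
  push:
    branches: [master, 'jdk*']    # personal fork master/stabilization branches
  workflow_dispatch:
    inputs:
      dry-run:                    # the "dry run" parameter mentioned above
        description: 'Configure only; skip actual builds and tests'
        type: boolean
        default: false
```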
The expected net result is that actual PRs branched off the personal fork's master would be able to use the caches from that master workflow run.
There is an implication in this caching scheme: one needs to sync their personal fork's master every so often for master->branch caching to work. If your workflow already includes that step, you are good. You can either fetch/push to the remote master in your personal fork, or press the magic "Sync fork" button in your personal fork's UI on GitHub.
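The fetch/push sync step can be demonstrated end to end with throwaway local repositories standing in for openjdk/jdk and your personal fork (a self-contained sketch; all repo names and paths here are made up for illustration):

```shell
#!/bin/sh
# Sketch of the fork-sync step: fetch upstream master, push it to your fork.
# "upstream.git" stands in for openjdk/jdk, "fork.git" for <id>/jdk.
set -e
tmp=$(mktemp -d)
cd "$tmp"

git init -q --bare upstream.git
git init -q --bare fork.git

# Seed the stand-in upstream with a commit on master.
git clone -q upstream.git seed 2>/dev/null
( cd seed \
  && git checkout -q -b master \
  && git -c user.email=demo@example.com -c user.name=demo \
       commit -q --allow-empty -m "upstream change" \
  && git push -q origin master )

# Your local clone: origin = personal fork, upstream = the shared repo.
git clone -q fork.git work 2>/dev/null
cd work
git remote add upstream ../upstream.git

# The actual sync: fetch upstream master, push it to the fork's master.
git fetch -q upstream master
git push -q origin refs/remotes/upstream/master:refs/heads/master

# Fork master now matches upstream master.
up_sha=$(git ls-remote ../upstream.git refs/heads/master | cut -f1)
fork_sha=$(git ls-remote ../fork.git refs/heads/master | cut -f1)
echo "fork in sync: $([ "$up_sha" = "$fork_sha" ] && echo yes || echo no)"
```

With the fork's master updated this way, the push-triggered dry run fires and warms the caches that descendant branches inherit.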
If, however, the remote master in your personal fork is not updated, the dry-run jobs would not run, and master->branch caching would not work; it would still fall back to the old caching behavior. This would be a normal outcome if, e.g., you only fetch the remote upstream master locally, directly into your feature branch. Maybe the availability of this kind of caching will prompt workflow adjustments for some :)
Keeping masters in perfect sync is not needed, IMO, as GHA caches are keyed by the hash of the .github contents. So caches would become stale only when the GHA code changes, which does not happen all that often.
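The "keyed by contents hash" behavior can be illustrated with a rough stand-in for a hash over .github files (the directory layout and hash function here are illustrative, not the actual key computation in the jdk workflows):

```shell
#!/bin/sh
# Sketch: a cache key derived from .github contents is stable across runs
# and only changes when a workflow file actually changes.
set -e
tmp=$(mktemp -d)
mkdir -p "$tmp/.github/workflows"
printf 'name: ci\n' > "$tmp/.github/workflows/main.yml"

hash_github() {
  # Rough stand-in for hashing all files under .github:
  find "$tmp/.github" -type f | sort | xargs cat | sha256sum | cut -d' ' -f1
}

key1=$(hash_github)
key2=$(hash_github)   # no edits: same key, caches stay warm

printf 'name: ci-v2\n' > "$tmp/.github/workflows/main.yml"
key3=$(hash_github)   # workflow edited: key changes, caches go stale

echo "stable without edits: $([ "$key1" = "$key2" ] && echo yes)"
echo "changed after edit:   $([ "$key1" != "$key3" ] && echo yes)"
```

So a slightly out-of-date fork master still produces cache hits, as long as the .github contents match those on the feature branch.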
duplicates:
JDK-8361288 Fix build of JTReg: wget exited with exit code 4 (Closed)
links to:
Commit(master) openjdk/jdk/1fa772e8
Review(master) openjdk/jdk/26134