As suggested in https://github.com/animetosho/md5-optimisation?tab=readme-ov-file#dependency-shortcut-in-g-function, we can delay the dependency on 'b' in the MD5 G function. The two terms of ((d & b) | (~d & c)) can never have the same bit set, because one is masked by d and the other by ~d, so the expression is equivalent to ((d & b) + (~d & c)). Since addition is associative, we can add the (~d & c) term to the rest of the sum independently, leaving our dependency on b to the final addition.
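To illustrate the identity, here is a minimal sketch of a single round-2 step in both forms. The class name `Md5GStepSketch` and the method names `ggOriginal` and `ggDelayed` are made up for this example and are not taken from the actual patch; the step shape (add, rotate left, add b) follows the usual MD5 definition.

```java
import java.util.Random;

public class Md5GStepSketch {
    // Textbook form: G(b, c, d) = (b & d) | (c & ~d), which needs b up front.
    static int ggOriginal(int a, int b, int c, int d, int x, int s, int ac) {
        a += ((b & d) | (c & ~d)) + x + ac;
        return Integer.rotateLeft(a, s) + b;
    }

    // Reordered form (hypothetical illustration): the two masked terms share
    // no set bits, so OR can be replaced by ADD, and the b-dependent term can
    // be added last.
    static int ggDelayed(int a, int b, int c, int d, int x, int s, int ac) {
        a += (c & ~d) + x + ac; // independent of b
        a += (b & d);           // b is first needed here
        return Integer.rotateLeft(a, s) + b;
    }

    public static void main(String[] args) {
        Random rnd = new Random(42);
        for (int i = 0; i < 1_000_000; i++) {
            int a = rnd.nextInt(), b = rnd.nextInt(), c = rnd.nextInt();
            int d = rnd.nextInt(), x = rnd.nextInt(), ac = rnd.nextInt();
            int s = 1 + rnd.nextInt(31); // rotate amount in 1..31
            if (ggOriginal(a, b, c, d, x, s, ac) != ggDelayed(a, b, c, d, x, s, ac)) {
                throw new AssertionError("mismatch at iteration " + i);
            }
        }
        System.out.println("all steps match");
    }
}
```

In the reordered form, b comes from the previous step, so everything on the first line can in principle be computed while that step is still finishing. Whether the JIT or an intrinsic actually schedules it that way is not shown by this sketch.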
I see around a 5% speedup on my x86 and aarch64 hosts with org.openjdk.bench.javax.crypto.full.MessageDigestBench.
```
Before:
Benchmark (algorithm) (dataSize) (provider) Mode Cnt Score Error Units
MessageDigestBench.digest MD5 1048576 thrpt 10 636.389 ± 0.240 ops/s
After:
Benchmark (algorithm) (dataSize) (provider) Mode Cnt Score Error Units
MessageDigestBench.digest MD5 1048576 thrpt 10 671.611 ± 0.226 ops/s
```
- links to
  - Commit(master) openjdk/jdk/1cf26a51
  - Review(master) openjdk/jdk/21203