Type: Enhancement
Resolution: Fixed
Priority: P4
Fix Version/s: 27
Affects Version/s: None
Component/s: security-libs
Labels: None
Subcomponent: javax.crypto
Resolved In Build: master
CPU: generic
OS: generic

Created on behalf of wuxinyang@hygon.cn

The current implementation of AES in ECB mode still calls a per-block intrinsic in a loop, which incurs superfluous invocations and context-switching overhead. We suggest introducing an intrinsic stub that handles the full plaintext/ciphertext and further optimizing it with parallel RoundKey addition.

===========================
Dear Security group and members,

Hello,

I recently submitted a PR that introduces a parallel intrinsic implementation for AES/ECB operations, aiming to replace the current per-block processing approach and improve performance for multi-block encryption/decryption.

This work is motivated by several performance limitations in the existing AES/ECB implementation (except for AVX-512 support):

   1. *Excessive stub call overhead* - each 16-byte block triggers a separate
      intrinsic call, leading to high invocation frequency.
   2. *Limited instruction-level parallelism* - serialized block processing
      does not fully utilize available ILP.
   3. *Redundant setup and teardown* - encryption state is repeatedly
      initialized for every block.
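
The property that makes the limitations above avoidable is that ECB blocks are independent of each other. The sketch below illustrates this at the public javax.crypto API level only: encrypting a buffer one 16-byte block at a time gives the same bytes as one call over the whole buffer. It is not a model of the HotSpot stub, and the class and method names are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

// Illustration only: compares per-block and whole-buffer AES/ECB encryption
// through the public API. Input length is assumed to be a multiple of 16.
final class EcbBlockIndependence {
    static final int BLOCK = 16; // AES block size in bytes

    static Cipher newEncryptor(byte[] key) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/ECB/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"));
        return c;
    }

    // One doFinal() call per 16-byte block, mirroring a per-block loop.
    static byte[] encryptPerBlock(byte[] key, byte[] plain) {
        try {
            Cipher c = newEncryptor(key);
            byte[] out = new byte[plain.length];
            for (int off = 0; off < plain.length; off += BLOCK) {
                c.doFinal(plain, off, BLOCK, out, off);
            }
            return out;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // A single call over the whole buffer.
    static byte[] encryptBulk(byte[] key, byte[] plain) {
        try {
            return newEncryptor(key).doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] key = new byte[16];   // all-zero demo key, not for real use
        byte[] plain = new byte[64]; // four blocks
        for (int i = 0; i < plain.length; i++) {
            plain[i] = (byte) i;
        }
        boolean same = Arrays.equals(encryptPerBlock(key, plain),
                                     encryptBulk(key, plain));
        System.out.println("per-block output equals bulk output: " + same);
    }
}
```

Because the two results are identical, a stub is free to take the whole buffer and schedule several blocks together without changing the output.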

Summary of changes

   - Added a parallel AES intrinsic implementation to process multiple blocks
     in a single native call.
   - Reduced intrinsic invocation overhead.
   - Improved utilization of instruction-level parallelism.
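
As a rough picture of what processing several blocks per call can look like, the sketch below applies the AES AddRoundKey step (an XOR of the state with the round key) to a batch of independent blocks per loop iteration, with a per-block tail. This is plain Java for illustration, not the stub code from the patch; the batch width of 4 and all names are assumptions, since the message does not state them.

```java
import java.util.Arrays;

// Illustration only: AddRoundKey over several independent blocks at once.
// LANES = 4 is a hypothetical batch width, not taken from the patch.
final class BatchedAddRoundKey {
    static final int BLOCK = 16; // AES block size in bytes
    static final int LANES = 4;  // assumed number of blocks per batch

    // XORs the 16-byte round key into one block starting at off.
    static void addRoundKey(byte[] state, int off, byte[] roundKey) {
        for (int i = 0; i < BLOCK; i++) {
            state[off + i] ^= roundKey[i];
        }
    }

    // Reference version: one block per step.
    static void perBlock(byte[] state, byte[] roundKey) {
        for (int off = 0; off < state.length; off += BLOCK) {
            addRoundKey(state, off, roundKey);
        }
    }

    // Batched version: LANES blocks per iteration, then a per-block tail.
    // The four XORs in the inner loop have no data dependence on each other.
    static void batched(byte[] state, byte[] roundKey) {
        int off = 0;
        int batchBytes = LANES * BLOCK;
        for (; off + batchBytes <= state.length; off += batchBytes) {
            for (int i = 0; i < BLOCK; i++) {
                byte k = roundKey[i];
                state[off + i] ^= k;
                state[off + BLOCK + i] ^= k;
                state[off + 2 * BLOCK + i] ^= k;
                state[off + 3 * BLOCK + i] ^= k;
            }
        }
        for (; off < state.length; off += BLOCK) {
            addRoundKey(state, off, roundKey);
        }
    }

    public static void main(String[] args) {
        byte[] roundKey = new byte[BLOCK];
        byte[] a = new byte[6 * BLOCK]; // one full batch plus a two-block tail
        for (int i = 0; i < roundKey.length; i++) roundKey[i] = (byte) (0xA0 + i);
        for (int i = 0; i < a.length; i++) a[i] = (byte) i;
        byte[] b = a.clone();
        perBlock(a, roundKey);
        batched(b, roundKey);
        System.out.println("batched equals per-block: " + Arrays.equals(a, b));
    }
}
```

In a real stub the same idea would be expressed with vector registers and AES instructions; the Java loop only shows the data flow.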

Performance results (JMH)

Test platform: Intel(R) Core(TM) i9-14900HX

OpenJDK 17 baseline:

Benchmark     Mode  Cnt      Score     Error  Units
AesTest.test  avgt    5  13334.163 ± 220.891  ns/op

With optimized implementation:

Benchmark     Mode  Cnt      Score     Error  Units
AesTest.test  avgt    5  10391.371 ±  94.966  ns/op

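This shows approximately *28.3% performance improvement* in throughput terms (13334.163 / 10391.371 ≈ 1.283), which corresponds to about 22.1% less time per operation.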
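
The percentage can be reproduced from the two JMH scores above. The short sketch below computes both common readings of the result: the reduction in time per operation and the equivalent gain in throughput. The class and method names are hypothetical; only the two scores come from the measurements.

```java
import java.util.Locale;

// Illustration only: derives improvement percentages from the two
// avgt scores reported above (lower ns/op is better).
final class SpeedupFromJmh {
    // Percentage by which the time per operation went down.
    static double timeReductionPercent(double baselineNs, double optimizedNs) {
        return (baselineNs - optimizedNs) / baselineNs * 100.0;
    }

    // Percentage by which operations per unit time went up.
    static double throughputGainPercent(double baselineNs, double optimizedNs) {
        return (baselineNs / optimizedNs - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double baseline = 13334.163;  // ns/op, OpenJDK 17 baseline
        double optimized = 10391.371; // ns/op, optimized implementation
        System.out.println(String.format(Locale.ROOT,
                "time per op reduced by %.1f%%",
                timeReductionPercent(baseline, optimized)));
        System.out.println(String.format(Locale.ROOT,
                "throughput increased by %.1f%%",
                throughputGainPercent(baseline, optimized)));
    }
}
```

The 28.3% figure matches the throughput reading (baseline time divided by optimized time, minus one).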

I would greatly appreciate your feedback on:

   - The design of the parallel intrinsic approach
   - Any potential correctness or portability concerns
   - Suggestions for further optimization or alignment with HotSpot intrinsic
     conventions

JBS Issue: https://bugs.openjdk.org/browse/JDK-8376164 - This issue tracks the performance improvement of AES/ECB operations by introducing a parallel intrinsic to reduce per-block overhead and enhance throughput.

I am very happy to revise or extend the patch based on your guidance.
Thank you for your time and for maintaining such a great platform.

Best regards,
Xinyang Wu

links to

Commit(master) openjdk/jdk/3e9fc5d4

Review(master) openjdk/jdk/29385

Assignee: Sendao Yan
Reporter: Sendao Yan
Votes: 0
Watchers: 5

Created: 2026-01-22 22:49
Updated: 3 hours ago
Resolved: 3 hours ago
