Improve the scalability of the TLS implementation by adding support for efficiently distributing and resuming TLS sessions across clusters of computers.
- Improve the scalability and throughput of the TLS implementation.
- Improve the performance of the SunJSSE provider for a multi-node cluster by 20%.
- Improve the performance of the SunJSSE provider for a single-node server by 5%.
Negotiating session parameters for TLS (in a full handshake) is expensive. Since clients frequently reconnect to the same server, TLS already supports efficiently reusing session credentials from a previous session between the same client/server. We wish to extend this benefit to reusing session credentials from a previous connection between the same client and an entire cluster, which will decrease server costs and increase application responsiveness.
In order to increase capacity (the number of concurrent users) and reliability, an application can be deployed on a cluster of servers, where network connections and traffic to the application are distributed across the cluster. The servers could be located in different locations, on different networks, or use different cloud VMs, containers, or other kinds of nodes. Distributed computation improves overall performance and reliability by decreasing the burden and dependency on an individual server in the system. Ideally, any server can be unplugged at runtime for replacement or upgrading, and new servers can be plugged in to extend the capacity.
A TLS connection is established via a TLS handshake. For an initial connection, the client and server negotiate the security parameters and then establish a secure channel. This negotiation of the security parameters is called a full handshake. Because it involves many cryptographic operations, the full handshake is costly. Fortunately, the negotiated parameters, which are also called session data, can be retained and reused for subsequent connections. The process of reusing the negotiated parameters is called an abbreviated handshake, or session resumption. Per this research, the overall cost of session resumption is 50% less than the full handshake, and the CPU cost is almost negligible (less than 5%) compared to the full handshake.
We wish to extend the benefit of session resumption from connections between the same client and server to connections between the same client and an entire cluster.
Define a more distribution-friendly session ticket protection scheme.
In order to resume the session, the negotiated parameters must be stored somewhere, such as in the server's cache or in a protected session ticket. A session ticket is a block of data that is generated and protected by the server, but is not cached on the server side. The negotiated parameters could be encapsulated and encrypted in the session ticket and delivered to the client for session resumption. The client will send back the exact session ticket in its session resumption request. The server retrieves the negotiated parameters by decapsulating and decrypting the received session ticket.
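The encapsulate, encrypt, and decrypt steps described above can be sketched as follows. This is only an illustration, not the SunJSSE ticket format: the class name `TicketSketch`, the `nonce || ciphertext` layout, and the choice of AES-GCM are all assumptions made for the example, and the session state is treated as an opaque byte array.

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Optional;

// Hypothetical sketch: a session ticket is the server-protected session state.
final class TicketSketch {
    private static final int NONCE_LEN = 12;  // 96-bit GCM nonce (assumed layout)
    private static final int TAG_BITS = 128;  // GCM authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Generates a fresh AES key that the server keeps to itself. */
    static SecretKey newKey() {
        try {
            KeyGenerator kg = KeyGenerator.getInstance("AES");
            kg.init(128);
            return kg.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Server side: encrypt the session state; ticket = nonce || ciphertext+tag. */
    static byte[] seal(SecretKey key, byte[] sessionState) {
        try {
            byte[] nonce = new byte[NONCE_LEN];
            RANDOM.nextBytes(nonce);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, nonce));
            byte[] ct = c.doFinal(sessionState);
            return ByteBuffer.allocate(NONCE_LEN + ct.length).put(nonce).put(ct).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Server side: recover the state from a ticket sent back by the client.
     *  An empty result means the ticket is unusable and a full handshake is needed. */
    static Optional<byte[]> open(SecretKey key, byte[] ticket) {
        if (ticket.length < NONCE_LEN + TAG_BITS / 8) {
            return Optional.empty();
        }
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key,
                   new GCMParameterSpec(TAG_BITS, ticket, 0, NONCE_LEN));
            return Optional.of(c.doFinal(ticket, NONCE_LEN, ticket.length - NONCE_LEN));
        } catch (GeneralSecurityException e) {
            return Optional.empty();  // wrong key or modified ticket
        }
    }
}
```

Because the ticket carries everything needed to rebuild the session, the server needs no per-session cache entry; it only has to keep the protection key. A ticket that fails authentication is simply ignored, and the connection falls back to a full handshake.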
To support distributed session resumption, a session ticket that is generated and protected in one server node must be usable for session resumption on other server nodes in the distributed system. Each node should use the same session ticket structure, and share the secrets that are used to protect session tickets.
The session ticket processes are defined in RFC 5077 for TLS 1.2 and prior versions, and RFC 8446 for TLS 1.3. However, the RFCs do not define how to construct and protect the session ticket. Currently, a session ticket generated in the JDK can only be used with the server that generated it. We wish to make this mechanism more distribution-friendly to improve the scalability and responsiveness of applications.
A session ticket protection scheme will be designed and implemented in the SunJSSE provider. The scheme will support key generation, key rotation and key synchronization across clusters of computers. By using the new session ticket protection scheme, the SunJSSE provider will be updated to support distributed session resumption.
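One way to picture key rotation and synchronization is a small key ring that every node holds, with each ticket tagged by the identifier of the key that protects it. The sketch below is an assumption-laden illustration rather than the scheme this proposal will define: `TicketKeyRing`, the 4-byte key identifier prefix, and the `install` method (standing in for whatever channel distributes keys to the nodes) are all hypothetical.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical per-node key ring; ticket = keyId (4 bytes) || nonce || ciphertext+tag.
final class TicketKeyRing {
    private static final int ID_LEN = 4, NONCE_LEN = 12, TAG_BITS = 128;
    private final SecureRandom random = new SecureRandom();
    private final Map<Integer, SecretKeySpec> keys = new LinkedHashMap<>();
    private final int maxKeys;
    private Integer currentId;  // key used for newly issued tickets

    TicketKeyRing(int maxKeys) { this.maxKeys = maxKeys; }

    /** Rotation: install a key delivered by the (assumed) synchronization
     *  channel, make it current, and retire the oldest keys beyond maxKeys. */
    synchronized void install(int keyId, byte[] rawAesKey) {
        keys.put(keyId, new SecretKeySpec(rawAesKey, "AES"));
        currentId = keyId;
        while (keys.size() > maxKeys) {
            keys.remove(keys.keySet().iterator().next());
        }
    }

    /** Protects session state under the current key. */
    synchronized byte[] seal(byte[] sessionState) {
        if (currentId == null) throw new IllegalStateException("no key installed");
        try {
            byte[] id = ByteBuffer.allocate(ID_LEN).putInt(currentId).array();
            byte[] nonce = new byte[NONCE_LEN];
            random.nextBytes(nonce);
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.ENCRYPT_MODE, keys.get(currentId),
                   new GCMParameterSpec(TAG_BITS, nonce));
            c.updateAAD(id);  // bind the key identifier to the ciphertext
            byte[] ct = c.doFinal(sessionState);
            return ByteBuffer.allocate(ID_LEN + NONCE_LEN + ct.length)
                             .put(id).put(nonce).put(ct).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Opens a ticket issued by any node holding the same key; empty if the
     *  key is unknown or retired, or the ticket fails authentication. */
    synchronized Optional<byte[]> open(byte[] ticket) {
        int header = ID_LEN + NONCE_LEN;
        if (ticket.length < header + TAG_BITS / 8) return Optional.empty();
        SecretKeySpec key = keys.get(ByteBuffer.wrap(ticket).getInt());
        if (key == null) return Optional.empty();
        try {
            Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
            c.init(Cipher.DECRYPT_MODE, key,
                   new GCMParameterSpec(TAG_BITS, ticket, ID_LEN, NONCE_LEN));
            c.updateAAD(ticket, 0, ID_LEN);
            return Optional.of(c.doFinal(ticket, header, ticket.length - header));
        } catch (GeneralSecurityException e) {
            return Optional.empty();
        }
    }
}
```

In this sketch, two nodes that have installed the same keys accept each other's tickets, and keeping more than one key in the ring lets tickets issued just before a rotation remain usable until their key is retired. How keys are generated and delivered to the nodes is deliberately left out.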
Testing will cover the following areas:
- Verifying that there is no compatibility impact.
- Verifying that there is no interoperability impact.
- Verifying that the performance is improved.
- Verifying that the session tickets generated and protected in one server node can be used for session resumption in other nodes in a distributed system.
- Verifying that the secret keys used to protect the session ticket can be rotated and synchronized.
- Verifying that a new server node inserted into the distributed system can be automatically synchronized, thus making it possible to plug in new server nodes as needed.
This is an improvement to the TLS 1.3 implementation delivered by JEP 332.