Details
Type: JEP
Status: Draft
Priority: P1
Resolution: Unresolved
Authors: Ron Pressler, Alex Buckley
JEP Type: Informational
Exposure: Open
Scope: SE
Description
Summary
The Java Platform assures the integrity of code and data with a variety of features that are on by default. Strong encapsulation is one such feature, but it can be circumvented by some APIs, causing headaches for maintenance and performance. As Java continues to move forward, it is appropriate to restrict all APIs so that they cannot break strong encapsulation without explicit user permission, while still accommodating use cases that need to operate beyond encapsulation boundaries.
Goals
- Allow the Java Platform to robustly maintain invariants – of its own operation as well as that of Java applications – required for maintainability, security, and performance.
- Clarify the plethora of Java and non-Java APIs that can break strong encapsulation.
- Differentiate use cases where breaking encapsulation is convenient from use cases where disabling encapsulation is essential.
Non-Goals
- It is not a goal to guard against situations where users compromise the integrity of the Java Platform by manipulating the underlying file system, operating system, or hardware. Appropriate integrity mechanisms in the operating system, container, etc., should always be used to protect the Java Platform and Java applications.
Motivation
Integrity in the Java Platform
Over the past few years, the Java Platform has been inching toward a vision of greater integrity. Integrity is the guarantee that a property established at one point in the program applies at all points in the program. For example, the array creation new int[10] establishes a property about the array – it is never read or written past its tenth element – and this property has integrity because the JVM guarantees the array bound is respected. Developers can rely on this property (which we call an integrity invariant) without having to analyze every line of code to confirm that it applies, as they would for a C program with a similar array creation. Integrity, therefore, enables local reasoning about a program's correctness.
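As a minimal sketch of the array-bound invariant described above, the following program checks that the JVM rejects a read one past the end of a ten-element array. The class and method names are illustrative and not part of the JEP.

```java
// Sketch: the JVM enforces the array-bound invariant on every access.
// ArrayBoundsDemo and readIsRejected are hypothetical names used for illustration.
final class ArrayBoundsDemo {

    /** Returns true if the JVM refuses to read a[index]. */
    static boolean readIsRejected(int[] a, int index) {
        try {
            int unused = a[index];   // bounds-checked by the JVM
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;             // the access never reached memory outside the array
        }
    }

    public static void main(String[] args) {
        int[] a = new int[10];
        System.out.println(readIsRejected(a, 9));    // index of the tenth element
        System.out.println(readIsRejected(a, 10));   // one past the tenth element
    }
}
```

No analysis of the rest of the program is needed to know that the second read fails; that is the local reasoning the invariant enables.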
Here are some other integrity invariants offered by the Java Platform:
- A program's initial state is well defined because the JVM guarantees that variables and arrays are initialized before use.
- A program never suffers from "use after free" because the JVM provides automatic memory management.
- A program cannot perform invalid operations on data (e.g., a String cannot be cast to a Socket) because Java programs are type-safe.
- In java.io and java.nio, the meaning of a relative file path is stable throughout a program's lifetime because the API does not allow the current directory to be changed (there is no chdir method).

In 2023, Java added a new integrity invariant: a guarantee that no unexpected errors (other than VM errors) can occur at arbitrary points in the program. This was accomplished by disabling Thread.stop in JDK 20.
Integrity invariants are safety properties that prevent "bad things" from happening. As such, we only notice how much we depend on integrity when things go wrong. The local reasoning enabled by integrity is important not only for developers, but also for the JVM as it analyzes and optimizes running code (to be discussed later). Integrity, therefore, is essential for the Java Platform's own operation.
Encapsulation as the Foundation of Integrity
The integrity invariants listed above are established by the Java Platform; they ensure that Java, unlike C, does not have undefined behavior. Java developers also want to establish integrity invariants: properties of their own code, and their own data, which are guaranteed to apply throughout the program. To do this, developers use access control modifiers – public, private, protected, and the default "package" access – to protect code and data declared in one part of the program from other parts. For example, suppose a developer wants to establish the invariant that the state of a counter-like object is always even, never odd. The class could be written as follows:
    public final class Even {
        private int x = 0;
        public int value() { return x; }
        public void incrementByTwo() { x += 2; }
        public void decrementByTwo() { x -= 2; }
    }
By declaring x as private and having all the public methods preserve the parity of x, the developer has used encapsulation to establish the invariant that every Even object in the program has even state. The integrity of this domain-specific invariant (the state is always even) relies on the integrity of encapsulation (when x is private, absolutely no code outside Even can touch it).
Encapsulation is a cornerstone of programming in the large because it allows a program to be constructed from independently-developed components that interact only through their public APIs, so that each of them can be reasoned about in isolation. It is this ability that allows both individual Java programs and the entire Java ecosystem to scale as collections of independent, interoperating components.
From Encapsulation To Strong Encapsulation
Unfortunately, the parity invariant above does not have the integrity that the developer might hope for. This is because any code on the class path could employ deep reflection to override access control (the private modifier on x) and assign an odd value to x directly. Deep reflection has existed since JDK 1.1, when the method java.lang.reflect.AccessibleObject.setAccessible was introduced.
Given the possibility of some code calling this method, it would take a global analysis of the codebase to ensure that Even's parity is, indeed, an invariant. The developer might assume that no code would intentionally break the invariant, but it could be broken unintentionally. For example, another developer in the organization could decide to serialize and deserialize instances of Even to and from JSON using a library. When deserializing JSON input, the library would bypass Even's public API and use deep reflection to set the value of x. If the JSON input contains an odd number, the invariant will be broken.
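The scenario above can be sketched in a few lines. This is an illustration only: the Even class is reproduced from the text (without the public modifier, so that everything fits in one source file), while DeepReflectionDemo and forceValue are hypothetical names standing in for what a deserialization library might do when all classes sit on the class path.

```java
import java.lang.reflect.Field;

// Sketch: code on the class path uses deep reflection to break Even's parity invariant.
final class DeepReflectionDemo {

    /** Overwrites the private field x directly, bypassing Even's public API. */
    static void forceValue(Even target, int newValue) {
        try {
            Field x = Even.class.getDeclaredField("x");
            x.setAccessible(true);        // suppresses the access check for 'private'
            x.setInt(target, newValue);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Even e = new Even();
        e.incrementByTwo();
        forceValue(e, 3);                 // e.g. an odd number read from JSON input
        System.out.println(e.value());    // the "always even" invariant no longer holds
    }
}

// The Even class from the text.
final class Even {
    private int x = 0;
    public int value() { return x; }
    public void incrementByTwo() { x += 2; }
    public void decrementByTwo() { x -= 2; }
}
```

Nothing in the source of Even hints that this can happen, which is why only a global analysis of the codebase could rule it out.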
As a result of deep reflection – and other mechanisms that disregard or bypass encapsulation, to be discussed later – the meaning of Java code is provisional, and the encapsulation merely advisory. A method or field is private, unless other code really wants to access it. A final field is assigned once, unless other code wants to assign it again later. The meaning of a method is defined by a block of code, unless other code decides to redefine the method later (this involves an agent, which is a class with access to a special API that allows it to change other Java code). This provisionality is not hypothetical: Some libraries change the meaning of code outside them in arbitrary ways, so that neither a person reading the code nor the Java Platform itself can believe that the code does what it says or that its meaning does not change over time as the program runs.
To allow developers to use encapsulation to truly establish integrity invariants, JDK 9 introduced modules. A module is a set of packages, some of which are designed to be used outside the module (they are exported), while others are designed to be used only inside the module (they are unexported). Everything in an unexported package is strongly encapsulated – deep reflection cannot break in. Similarly, the non-public elements of exported packages are also strongly encapsulated. Since x is a private field, strong encapsulation allows the parity invariant that is established locally in the Even class to be trusted globally.
Strong encapsulation gives integrity to encapsulation – it guarantees no one outside the class can assign x – and in so doing it gives integrity to the invariant that x is always even. Strong encapsulation offers a solid foundation to build on. Without it, code is a castle in the sand.
Other than making it easier to establish business-logic invariants important for a program's correctness, strong encapsulation is beneficial for three general reasons:
- Maintainability: Strong encapsulation protects the integrity of code as it evolves. When evolving Java code, developers assume that private implementation details, encapsulated from clients, can be safely changed. For example, every developer assumes that changing the signature of a private method, or removing a private field, does not impact the class's clients.
- Security: Strong encapsulation is essential for constructing any kind of robust security, whether in the application, a library, or the JDK. Suppose that a class restricts a sensitive operation as follows:

      if (isAuthorized()) doSensitiveOperation();

  The restriction is robust only if we can guarantee that doSensitiveOperation is only ever invoked after a successful isAuthorized check. This invariant is established by the enclosing class declaring doSensitiveOperation as private and preceding all calls to it with an isAuthorized check. However, with deep reflection, doSensitiveOperation could be invoked from anywhere without an isAuthorized check, nullifying the intended restriction; even worse, an agent could modify the code of the isAuthorized method to always return true. Without strong encapsulation, a global analysis of the codebase, including the application's direct and transitive dependencies, is required to guarantee that security invariants hold in every circumstance. The circumvention of security invariants need not be intentional; a vulnerability in a library that breaks encapsulation, or in any other library that uses that library, jeopardizes any security invariant anywhere in the application.
- Performance: In the Java runtime, certain optimizations assume that conditions that hold at the time the optimization is made hold forever. For example, the JVM can perform powerful optimizations when it knows that the value of a field will never change – not only constant-folding but also shifting the time of the initialization of final fields. This can only be done if the "finality" of final fields cannot be overridden by any mechanism (assuring the finality of final fields is a complicated subject, but strong encapsulation makes it easier). Additionally, further optimization of methods could be performed when all of their call sites are known. This could be guaranteed for strongly encapsulated methods, as they can only be called from inside their module. A tool like jlink could remove unused strongly-encapsulated methods at link time to reduce image size and class loading time. The guarantee that code may not change over time even opens the door to ahead-of-time compilation (AOT).
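To make the security point above concrete, here is a hedged sketch of the guarded operation and of how deep reflection sidesteps the guard when the classes are not strongly encapsulated (for example, when they all sit on the class path). The names isAuthorized and doSensitiveOperation come from the text; Vault, request, and GuardBypassDemo are assumptions made for the example.

```java
import java.lang.reflect.Method;

// Sketch: invoking a private, guarded method without going through its guard.
final class GuardBypassDemo {

    /** Calls Vault.doSensitiveOperation reflectively, skipping the isAuthorized check. */
    static String callWithoutCheck(Vault vault) {
        try {
            Method m = Vault.class.getDeclaredMethod("doSensitiveOperation");
            m.setAccessible(true);                 // deep reflection overrides 'private'
            return (String) m.invoke(vault);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Vault vault = new Vault();
        System.out.println(vault.request());           // goes through the guard
        System.out.println(callWithoutCheck(vault));   // goes around it
    }
}

// Hypothetical class that restricts a sensitive operation as in the text.
final class Vault {
    private boolean isAuthorized() { return false; }   // nobody is authorized in this sketch

    private String doSensitiveOperation() { return "secret"; }

    public String request() {
        if (isAuthorized()) return doSensitiveOperation();
        return "denied";
    }
}
```

If Vault lived in a package of a named module that is not opened to the caller, the setAccessible call would fail instead, and the guard in request would be the only way in.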
Strong Encapsulation by Default
Because Java had allowed encapsulation to be broken via deep reflection since JDK 1.1, a number of libraries intended for use in production came to depend on the ability to break it. The reasons for breaking encapsulation were varied:
- A client may desire functionality that isn't exposed through an API. For example, a client of the Even class may want to implement a method, Even add(Even a, Even b), that returns a new Even object whose value is the sum of a's and b's. Finding it hard to implement add by calling public methods of the Even class, the programmer opts to employ deep reflection to set the x field of the resulting Even instance directly. Many libraries use deep reflection to access JDK classes whose APIs were not intended for general use, such as the classes sun.misc.BASE64Encoder and BASE64Decoder, the package sun.security.x509, the packages under com.sun.net.ssl, and the packages under com.sun.image.codec.jpeg. Developers encroached on the encapsulation of JDK internals because it was convenient; the resulting loss of integrity was less concerning than, for example, the addition of a dependency on Apache Commons.
- Internal access may be needed to work around a bug before it is fixed. This, for example.
- Internal functionality could offer better performance. For example, a client of the Even class might want to increment the counter by 100; finding fifty calls to incrementByTwo too slow, they use deep reflection to update the x field directly. As another example, libraries use sun.misc.Unsafe to compare-and-set a field atomically and quickly.
Libraries that broke encapsulation supplied needed and useful functionality to the Java ecosystem. Application developers benefited from the functionality of libraries deep in their dependency tree that broke encapsulation, and at the same time enjoyed the encapsulation in their own code. But that encapsulation was illusory. If even a single library used by an application could arbitrarily bypass encapsulation, none of the integrity invariants established through encapsulation in the entire application could be relied upon. These benefits are, unfortunately, contradictory, and applications must be allowed to choose between them.
Because deep dependency trees are common, the chances are high that an application would unknowingly depend on an encapsulation-breaking library. Consequently, if applications had to opt into strong encapsulation, few would be able to do so. The platform must, therefore, exert pressure on the ecosystem to minimize the proliferation of libraries that bypass strong encapsulation by making strong encapsulation opt out rather than opt in.
Strong encapsulation must be the default. This is the goal the Java Platform is approaching.
JDK 9 accommodated those libraries that broke encapsulation by only enforcing strong encapsulation at compile time; meanwhile, at run time, deep reflection was permitted, with "illegal reflective access" warnings to encourage maintainers to prepare libraries for strong encapsulation. Official replacements for the internal JDK classes above were added to the JDK, massively reducing the need to break encapsulation on modern JDKs. The VarHandle API and the ongoing work on the Foreign Function & Memory API make uses of sun.misc.Unsafe obsolete. Legacy bugs have been fixed so it is exceptionally rare to need to break encapsulation to work around them. Library developers wishing to target both new and old JDKs can easily do so using a Multi-Release JAR.
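As an illustration of the supported replacement mentioned above, the sketch below performs an atomic compare-and-set on a private field through a VarHandle instead of sun.misc.Unsafe. The class CasCounter and its members are hypothetical; only the java.lang.invoke API calls are standard.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Sketch: atomic compare-and-set on a field via VarHandle, with no use of sun.misc.Unsafe.
final class CasCounter {
    private volatile int value;

    // The VarHandle is obtained with the class's own lookup, so no encapsulation is broken.
    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup().findVarHandle(CasCounter.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Atomically sets the value to 'updated' if it currently equals 'expected'. */
    boolean compareAndSet(int expected, int updated) {
        return VALUE.compareAndSet(this, expected, updated);
    }

    int get() { return value; }

    public static void main(String[] args) {
        CasCounter c = new CasCounter();
        System.out.println(c.compareAndSet(0, 5));   // succeeds: value was 0
        System.out.println(c.compareAndSet(0, 7));   // fails: value is now 5
        System.out.println(c.get());
    }
}
```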
In 2021, JDK 16 began enforcing strong encapsulation at run time, turning the warnings into errors. Applications that encounter access errors due to encapsulation-breaking libraries must update them to versions that don't access JDK internals.
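The run-time enforcement can be observed with a small probe. The sketch below attempts deep reflection on the protected method Object.clone, which requires the java.lang package to be open to the caller, and compares the outcome with what the module system reports. The class and method names are illustrative; on JDK 16 or later, with no --add-opens option, the attempt is expected to be refused.

```java
import java.lang.reflect.InaccessibleObjectException;
import java.lang.reflect.Method;

// Sketch: probing whether deep reflection into java.lang is permitted at run time.
final class JdkEncapsulationProbe {

    /** Returns true if setAccessible succeeds on a non-public member of java.lang.Object. */
    static boolean canDeepReflectIntoJavaLang() {
        try {
            Method clone = Object.class.getDeclaredMethod("clone");
            clone.setAccessible(true);   // needs java.lang to be open to this code
            return true;
        } catch (InaccessibleObjectException e) {
            return false;                // strong encapsulation is being enforced
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns true if the module system reports java.lang as open to this class's module. */
    static boolean javaLangIsOpenToUs() {
        return Object.class.getModule()
                .isOpen("java.lang", JdkEncapsulationProbe.class.getModule());
    }

    public static void main(String[] args) {
        System.out.println("deep reflection permitted: " + canDeepReflectIntoJavaLang());
        System.out.println("java.lang open to caller:  " + javaLangIsOpenToUs());
    }
}
```

Running the same program with --add-opens java.base/java.lang=ALL-UNNAMED should flip both answers, which is the explicit, auditable consent discussed in the next section.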
Disabling Strong Encapsulation
As a practical matter, some libraries haven't been updated to run on JDK 16 and above, but it's necessary to run them on JDK 16 and above anyway. The circumstances for breaking encapsulation, unfortunately, persist.
In addition, there are some tools and libraries whose functionality fundamentally operates beyond encapsulation boundaries. Here are a few examples:
- White-box testing and related techniques, such as mocking, require direct access to encapsulated code and may even require changing its internal logic. This use case is only relevant during development, not production.
- Frameworks may offer functionality, such as dependency injection, that requires operating beyond the encapsulation boundaries of their client classes.
- Application Performance Monitoring (APM) and other advanced observability tools may require observing code internals by instrumenting arbitrary code to emit tracing events using a JVM TI agent or a Java agent.
To balance the need for integrity with both the circumstantial, convenience uses of JDK internals and the essential uses, Java gives the user – the application's owner (typically its author, maintainer, or deployer) – the final say on which strong encapsulation boundaries are in place and which should be ignored. This freedom is offered under the guiding principle that the ability of one component to encroach on the boundaries of another must be explicitly granted by the application. Libraries cannot choose to obtain encapsulation-busting "superpowers" without the knowledge and consent of the application's owner.
Integrity by default, therefore, means that integrity may be broken – but only with the user's consent.
This consent can be granted as follows:
- For temporary uses of convenience, the application can employ the --add-opens or --add-exports command-line options to allow code in one module to disable strong encapsulation and access strongly encapsulated classes and members in another module. This should only be done as a last resort. If an application's startup script contains, for example:

      --add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED

  then it is a red flag indicating that libraries on the class path have not been kept up-to-date and are not portable to modern JDKs.
- For white-box testing of code in user modules, build tools and testing frameworks should automatically emit --add-exports, --add-opens, and --patch-module for the module under test, as appropriate (for example, patching the module under test with the contents of the test module allows the testing of package access methods).
- Frameworks should not rely on --add-opens but rather have their client classes grant them encapsulation-breaking privileges. This could be done either declaratively in the client module with opens pkgName to acmeFramework (the framework may then transfer that permission with Module.addOpens), or programmatically with an appropriate MethodHandles.Lookup, a capability object that captures the client class's own access permissions. For example, a client class could grant such privilege in a static initializer as follows:

      static { AcmeFramework.grantAccess(MethodHandles.lookup()); }

- APM tools should require the application to deploy their agents with the -javaagent or -agentlib option. This explicitly grants the agent permission to instrument and modify classes. Mocking libraries that employ an agent to change classes' behavior should do the same.
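The programmatic grant described in the list above can be sketched as follows. AcmeFramework and grantAccess are the names used in the text; the framework's internals, the inject method, and the Client class are assumptions made purely for illustration. The framework can write a client's private field only because the client handed over its own MethodHandles.Lookup.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: a client class grants a framework access to its privates via a Lookup capability.
final class LookupGrantDemo {
    public static void main(String[] args) {
        Client client = new Client();             // runs Client's static initializer first
        AcmeFramework.inject(client, "greeting", "hello");
        System.out.println(client.greeting());
    }
}

// Hypothetical framework: it holds only the capabilities that clients chose to give it.
final class AcmeFramework {
    private static final Map<Class<?>, MethodHandles.Lookup> GRANTS = new ConcurrentHashMap<>();

    /** Called by a client class to hand over a Lookup carrying its own access rights. */
    static void grantAccess(MethodHandles.Lookup lookup) {
        GRANTS.put(lookup.lookupClass(), lookup);
    }

    /** Sets a private String field, but only on classes that have granted access. */
    static void inject(Object target, String fieldName, String value) {
        MethodHandles.Lookup lookup = GRANTS.get(target.getClass());
        if (lookup == null) {
            throw new IllegalStateException("no access granted by " + target.getClass());
        }
        try {
            VarHandle field = lookup.findVarHandle(target.getClass(), fieldName, String.class);
            field.set(target, value);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

// Hypothetical client class that explicitly consents to the framework's access.
final class Client {
    static { AcmeFramework.grantAccess(MethodHandles.lookup()); }

    private String greeting;

    String greeting() { return greeting; }
}
```

The design point is that consent travels with an object: a class that never calls grantAccess stays out of the framework's reach, and no command-line option is involved.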
Integrity requires that libraries must not encroach on other components without the application's consent; otherwise, the boundaries on the map -- and so the attack surface area of the application, its maintenance risk, and the optimizations that can be performed -- would be unknowable. When only the application is permitted to explicitly grant "superpower" privileges, the application's authors are able to better judge what risks affect them and to better control the attack surface area of the application. The command line serves as an auditable map of the codebase and its internal encapsulation boundaries that the application draws as it wishes.
Disabling strong encapsulation imposes risks:
- A library, however well-meaning, that is granted the privilege to break the strong encapsulation of the JDK's modules is able to make use of internal JDK classes that are not subject to backward compatibility, making it non-portable. Such a library may break without warning on any release of the JDK (including patch releases) – as it may use, say, a private method whose signature has changed – and so poses a maintenance risk for an application that uses it.
- Some performance optimizations may be hard or impossible to do when the application's author chooses to ignore a module's boundaries.
- Strong encapsulation provides security bulkheads that restrict a vulnerability in one component from affecting others. Granting access permissions can make the application vulnerable as discussed above. If library A is granted the permission to perform deep reflection on the package where doSensitiveOperation happens to reside and library B employs library A, a vulnerability in B may allow a remote attacker to direct A to call doSensitiveOperation without the access check.

These risks accrue when the list of --add-opens isn't properly documented and maintained. Indeed, command-line options may be perpetuated by habit even when they're no longer needed (an application could upgrade a library that used to require a particular --add-opens but no longer does, and the option is not removed).
Overall, the burden of responsibility imposed on application maintainers who find themselves having to maintain encapsulation-disabling permissions is nowhere near as high as the cost that lacking integrity by default places on the platform and the ecosystem. A palpable demonstration of that cost was the difficulty many applications experienced when migrating from JDK 8 to later versions, which was predominantly caused by non-portable libraries.
The experience of the past few years has shown that the ecosystem is able to adapt to strong encapsulation -- at least of the JDK itself. Most Java code, which resides in applications, has never had much need to directly access JDK internals; high-level libraries and frameworks have similarly rarely reached into the innards of the JDK. Code that breaks encapsulation is usually found in low-level libraries that would normally be transitive dependencies of applications, and many libraries that had previously depended on JDK internals have stopped doing so. The impact on the ecosystem has mostly been that applications were required to upgrade their dependencies. Simultaneously, the burden placed on applications to grant libraries "superpower" privileges has put pressure on libraries to reduce their reliance on deep reflection and similar capabilities.
Beyond Deep Reflection
Integrity by default has not yet been achieved because strong encapsulation is not yet universal in the Java Platform. Some APIs allow any library to surreptitiously claim integrity-violating superpowers for itself, without the application's explicit consent, and use these superpowers to break encapsulation. Any library can:
- Use sun.misc.Unsafe to access and modify private fields.
- Load a native library that employs JNI to call private methods and set private fields. (The JNI API is not subject to access checks.)
- Load an agent that changes code in a running application, using an API intended for tools only.
It is worth mentioning that sun.misc.Unsafe is able to break not only strong encapsulation but even Java's most foundational integrity mechanisms, mentioned earlier. For example, a library using Unsafe can access arrays without bounds checking, and can access an object that has been deallocated by the garbage collector; accordingly, a program utilizing Unsafe may have undefined behavior. Much of the same applies to programs which make use of native code via JNI or the "Linker" component of the Foreign Function & Memory API, although that undefined behavior is caused not by Java code but by native code.
These APIs mean that Java does not yet provide integrity by default. Invariants can be relied upon neither by people nor by the platform itself. In particular, security can only be achieved with a difficult, often infeasible, global analysis of the application and its dependencies, as a vulnerability in any direct or transitive dependency could potentially be exploited and turned into a gadget that circumvents any authorization check in the application. Additionally, application authors are unable to know whether one of their dependencies relies on internal implementation details of the JDK, making the application unable to easily upgrade a JDK version.
To attain our goal of integrity by default, we will gradually restrict these APIs and close all loopholes in a series of upcoming JEPs, ensuring that no library can assume superpowers without the application's consent. Libraries that rely on these APIs should spend the time remaining until they are restricted to prepare their users for any necessary changes.
Why Now?
An obvious question: Why has the Java Platform been progressing toward integrity by default over the past few years, putting obstacles in the path of some clever, occasionally-useful tricks, when applications managed fine without strong encapsulation for two decades?
The answer is that Java must adapt to changing circumstances and requirements:
- The platform is able to enforce primitive integrity invariants – such as the invariant that all arrays are initialized – because those invariants are maintained in native code deep inside the JVM and are therefore unaffected by encapsulation-breaking capabilities of Java code. However, more and more of the Java runtime is being written (or rewritten) in Java. For example, legacy I/O and the implementation of reflection have been rewritten in Java, while the virtual thread scheduler is written in Java and the monitors used by synchronized code are expected to be rewritten in Java in the future. (These two are important to maintain the Java Memory Model.) In future, the JVM's JIT compiler may be written in Java; breaking its encapsulation could violate any and all invariants made by the platform. In a nutshell, even the integrity of the platform's basic operations is increasingly reliant on strong encapsulation.
- We had to make the JDK more maintainable and remove obsolete packages to be able to add new features without drowning in maintenance. Not only had the JDK itself become a Big Ball of Mud, but entire layers of libraries that reached into the JDK's innards threw themselves into the same sticky mess. That resulted in a serious evolution problem as the ecosystem ossified around a specific JDK version, which manifested in the difficulty migrating from JDK 8 to later versions of the JDK. Continuing to evolve the JDK, let alone at a faster pace, would have created such difficulties with every release, forever. The choice was between inflicting migration pain just once more by encapsulating the JDK's internals and stopping the evolution of Java.
- Java's primary security threats have shifted from untrusted code running in the client to remote attacks on servers, which made the Security Manager an ill-suited solution. But we need a mechanism to allow the construction of robust security in layers above the JDK. (Because it is essential for security, the Security Manager did offer strong encapsulation, though not by default, and configuring it correctly in practice was difficult.)
- There is a growing demand for performance optimization of startup time and image size that are important for deploying Java applications in some emerging environments. Such optimizations require that code does not change its meaning from build time to run time.
In short: The evolution of the JDK caused serious migration issues, there was no practical mechanism that enabled robust security in the current landscape, and new requirements could not be met.
Despite the convenience that lack of integrity has offered to "superpowered" libraries, the situation is untenable. Strong encapsulation is the linchpin of the solutions. The effort to add strong encapsulation to Java began in the 2010s, but its importance is becoming clearer with every passing year, so the effort continues.
Conclusion
Integrity is a solid foundation for the Java Platform and its vast ecosystem. It is a prerequisite for maintainability, robust security, and a number of optimizations that are in growing demand. Integrity by default has not yet been achieved due to loopholes that allow a library to break strong encapsulation without the application's explicit consent.
Integrity can be the default. The last few years have proven that the vast majority of code does not require breaking encapsulation. In special circumstances, it is useful to selectively disable encapsulation, and the Java Platform allows it, but only with the user's consent so that risks can be considered.
Integrity must be the default. We have seen the effect of it not being the default when, prior to strong encapsulation, libraries reaching for JDK internals ossified the ecosystem around a particular JDK version, making upgrades difficult.
Issue Links
- relates to
  - JDK-8061972 JEP 261: Module System (Closed)
  - JDK-8132928 JEP 260: Encapsulate Most Internal APIs (Closed)
  - JDK-8255363 JEP 396: Strongly Encapsulate JDK Internals by Default (Closed)
  - JDK-8263547 JEP 403: Strongly Encapsulate JDK Internals (Closed)
  - JDK-8306275 JEP 451: Prepare to Disallow the Dynamic Loading of Agents (Closed)
  - JDK-8307341 Prepare to Restrict The Use of JNI (Draft)