Type: Enhancement
Resolution: Unresolved
Priority: P4
Fix Understood
The JDK has bundled the original zlib implementation [1] almost from the very beginning, starting with zlib 1.0.4 in JDK 1.1. Since then, the bundled zlib has been updated to the latest released version, 1.2.11. However, the pace of development in the original zlib has slowed considerably in recent years.
Lately, new projects have picked up the original zlib implementation and considerably improved its deflate and inflate speed. Testing shows that compression throughput can be improved by 50 to 100% with Cloudflare's zlib version [2] and decompression throughput by 30 to 60% with the zlib implementation from the Chromium project [3].
Overview
--------
This enhancement proposes some simple changes to the JDK's zlib wrapper libzip which make it possible to selectively choose alternative zlib implementations for deflation and/or inflation at VM startup, based on system properties. As an additional benefit, it makes it possible to use the system zlib version even if the JDK was configured with the bundled zlib.
The JDK can already be configured at build time to either use the bundled zlib version or to dynamically link against the system zlib. If linked dynamically, LD_LIBRARY_PATH or LD_PRELOAD can be used to run the JDK with an alternative zlib implementation. This approach is however not very flexible because:
- once configured with the bundled zlib, it is not possible to use another implementation any more. Some platforms like Windows don't have a default, system-provided zlib, so they are always built with "--with-zlib=bundled" and therefore can't change the implementation at runtime.
- even if configured with "--with-zlib=system", it is only possible to use a single alternative version as a substitute for the system zlib. But we've found (see benchmarks [4]) that there's no single zlib implementation which improves both compression and decompression equally well. It may therefore make sense to be able to selectively use one implementation for compression and another one for decompression.
This enhancement proposes three new system properties:
org.openjdk.zlib.implementation
org.openjdk.zlib.implementation.inflate
org.openjdk.zlib.implementation.deflate
which can be used to either replace the full zlib functionality or just the deflate or inflate part with a new implementation. As an argument they take an absolute or relative path to an API-compatible, shared zlib library, e.g.:
-Dorg.openjdk.zlib.implementation=zlib-cloudflare.so
-Dorg.openjdk.zlib.implementation.deflate=/Git/zlib-bench/lib/x86_64/zlib-cloudflare.dylib
-Dorg.openjdk.zlib.implementation.inflate='D:\Git\zlib-bench\lib\x86_64\zlib-chromium.dll'
An argument with the value "system" can be used to instruct the JDK to load the default zlib version on the corresponding system. Notice that this option now also works in the case where the JDK was configured with a bundled zlib.
The naming of the properties is of course subject to discussion. If we reach agreement on this change and the naming, I'll be happy to open a CSR for the introduction of the new properties.
For testing purposes, I've made precompiled versions of zlib-cloudflare and zlib-chromium for Linux/x86_64, Linux/aarch64, MacOS/x86_64 and Windows/x86_64 available in my GitHub repository [6].
The implementation of this enhancement is fairly simple. Instead of calling directly into zlib from libzip, I've replaced all direct calls by indirect calls through function pointers. By default, these function pointers point either to the corresponding functions in the bundled zlib implementation or to the corresponding functions in the dynamically linked system zlib. If none of the above system properties is set, the only change to the current implementation is the replacement of some direct calls by indirect calls. Because these function calls are usually made through JNI and because the underlying code is usually quite compute intensive, I couldn't measure any performance regression from this change.
If one of the above system properties is set, the selected shared library is dynamically loaded and the corresponding function pointers are redirected to point to the new implementation. I have verified that this works on Linux, Windows and Mac OS X. You can find some more implementation details at the end of this write-up.
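To illustrate the mechanism, here is a minimal, simplified sketch of such a function pointer indirection using dlopen()/dlsym() on POSIX systems. The names (zlib_inflate, zlib_deflate, load_zlib_impl, do_inflate) are made up for illustration and are not the identifiers used by the actual patch; the real change also has to cover all other zlib entry points used by libzip (e.g. inflateInit2_, deflateInit2_, crc32) as well as the Windows equivalents LoadLibrary()/GetProcAddress().

#include <stdio.h>
#include <dlfcn.h>   /* POSIX; on Windows LoadLibrary()/GetProcAddress() would be used */
#include <zlib.h>

/* By default the pointers refer to the bundled (or system) zlib functions
   libzip is linked against. */
static int (*zlib_inflate)(z_streamp strm, int flush) = inflate;
static int (*zlib_deflate)(z_streamp strm, int flush) = deflate;

/* Called once at VM startup if one of the new properties was given on the
   command line; redirects the pointers to the selected shared library. */
static int load_zlib_impl(const char *path) {
  void *handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
  if (handle == NULL) {
    fprintf(stderr, "could not load %s: %s\n", path, dlerror());
    return -1;
  }
  void *inf = dlsym(handle, "inflate");
  void *def = dlsym(handle, "deflate");
  if (inf != NULL) zlib_inflate = (int (*)(z_streamp, int))inf;
  if (def != NULL) zlib_deflate = (int (*)(z_streamp, int))def;
  return 0;
}

/* libzip then calls zlib only through the pointers: */
static int do_inflate(z_streamp strm, int flush) {
  return (*zlib_inflate)(strm, flush);
}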
Benchmarks
----------
I've run native benchmarks to compare the compression/decompression throughput and the compression ratio at various compression levels for zlib-adler [7], zlib-chromium [8], zlib-ng [9], zlib-ipp [10], zlib-cloudflare [11] and the isa-l [12] library. The results and more details can be found in my zlib-bench GitHub repository [5].
I then used precompiled versions of zlib-cloudflare and zlib-chromium together with the changes proposed by this enhancement to rerun these benchmarks in Java on Linux/x86_64 and Linux/aarch64. I think the results are quite impressive [4], showing around 100% higher throughput for compression and around 50% higher throughput for decompression on both architectures.
Implementation details
----------------------
Before change "8237750: Load libzip.so only if necessary" [13], libzip (and transitively libz) was loaded early at VM startup in "init_globals() -> ClassLoader::initialize()". For this change I've re-enabled this early loading, but only if one of the new system properties was given on the command line. I expect these new properties will only be used if heavy compression/decompression usage is expected, and in that case it doesn't hurt to load libzip early at startup. Loading libzip later and on demand would otherwise make the synchronization of the function pointer initialization unnecessarily hard, because libzip can be loaded independently from the VM as well as from the class library.
An alternative approach would be to use __attribute__((constructor)) (on Linux) or DllMain() (on Windows) to perform the function pointer initialization once when libzip gets loaded. But first, this approach is quite system dependent and I'm not sure it is available on all platforms (I'm pretty sure it doesn't work on AIX :). Second, I think it is not possible to dynamically load DLLs from DllMain() on Windows, which would be required in this case.
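For comparison, such a constructor-based initialization would look roughly like the sketch below (GCC/Clang syntax only; the function name and the environment variable are hypothetical and merely illustrate that no JNIEnv or VM arguments are available at this point):

#include <stdlib.h>

/* Hypothetical sketch of the rejected alternative: run the initialization
   automatically when libzip is loaded. Works with GCC/Clang on ELF platforms,
   but is not portable to all toolchains and platforms. */
__attribute__((constructor))
static void zlib_impl_init(void) {
  /* No JNIEnv or VM arguments are available here, so the configuration would
     have to be passed e.g. through a (hypothetical) environment variable. */
  const char *path = getenv("OPENJDK_ZLIB_IMPLEMENTATION");
  if (path != NULL) {
    /* load_zlib_impl(path);  (see the sketch above) */
  }
}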
A third possibility would be to define a JNI_OnLoad function for libzip. But unfortunately, libzip is not only loaded from java.util.zip, where we have a JNIEnv, but also from HotSpot or directly with dlopen() from libjimage.
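For completeness, the hook has the standard JNI signature shown below, but it is only invoked when the library is loaded through the JVM (System.loadLibrary()), not when it is dlopen()'ed directly, which is exactly the problem described above (the body is a hypothetical sketch):

#include <jni.h>

/* Standard JNI load hook. The JVM calls it when libzip is loaded via
   System.loadLibrary(), but NOT when the library is loaded directly with
   dlopen() from HotSpot or libjimage. */
JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM *vm, void *reserved) {
  (void)vm; (void)reserved;
  /* function pointer initialization could go here */
  return JNI_VERSION_1_8;
}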
I've also removed the following OpenJDK-specific patch to "zconf.h" in the bundled zlib version, which seems to have been there since the very beginning but which I don't think is required any more:
src/java.base/share/native/libzip/zlib/zconf.h
-#ifdef _LP64
-typedef unsigned int uLong; /* 32 bits or more */
-#else
typedef unsigned long uLong; /* 32 bits or more */
-#endif
Removing this patch is needed to make the bundled zlib compatible with the system zlib and the other zlib implementations, none of which have this change. Notice that when building with the system zlib we are already using this setting today, because in that case "zconf.h" is taken from the system include path.
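The compatibility issue can be seen directly in the public zlib API: z_stream contains uLong fields (total_in, total_out, adler), so a libzip compiled with the patched 32-bit uLong would disagree about the struct layout with a replacement library built against a stock zconf.h, where uLong is a 64-bit unsigned long on LP64 platforms. A minimal illustration (standard zlib API, not part of the patch):

#include <stdio.h>
#include <zlib.h>

int main(void) {
  /* The size and layout of z_stream depend on the definition of uLong in
     zconf.h. With the old OpenJDK patch (uLong == unsigned int on _LP64)
     these values differ from those of a stock zlib build, which makes it
     unsafe to pass a z_stream across the two. */
  printf("sizeof(uLong)    = %zu\n", sizeof(uLong));
  printf("sizeof(z_stream) = %zu\n", sizeof(z_stream));
  return 0;
}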
Finally, it should be noted that although we are loading libzip early, there are still two users of zlib functionality which won't be able to benefit from the new implementations because they either statically link in parts of the bundled zlib or directly and dynamically link against the system zlib: libsplashscreen and libjli. Both are only used at startup and both only make limited use of zlib (if they use it at all), so I don't think they are relevant.
Risks
-----
Unfortunately, ZipInputStream/ZipOutputStream have been designed such that it is very easy to use them in an unsupported way. It is in general not possible to read a ZipEntry from a ZipInputStream and write that exact ZipEntry into a ZipOutputStream, followed by the inflated and re-deflated data of that entry. However, this naive approach is sometimes used to copy zip entries from one zip input file into another zip output file. It will only work if the exact same deflate implementation and compression level are used for the recompression as were used for the initial compression.
This is because the ZipEntry contains the compressed size of the data belonging to that entry, but the DEFLATE format doesn't mandate a specific, fixed compression algorithm which always results in the same compressed size. Different implementations are free to use different approaches for compression, which can produce valid entries with potentially different compressed sizes.
Even if just a single implementation is used for both compression and decompression, it is still not possible to detect the compression level from the compressed data. It is therefore possible that decompressing and re-compressing that data into a ZipOutputStream results in a different compressed size. ZipOutputStream will then throw an exception, because the compressed size stored in the ZipEntry differs from the actual size of the newly compressed data.
I've recently fixed two such cases in the OpenJDK itself (JDK-8240333 [14], JDK-8240235 [15]), but there may be other such use cases in the wild which could throw a "ZipException: invalid entry compressed size (expected 66 but got 67 bytes)". If that happens, the user can either fix their code (which is trivial and advised because of the problems explained above) or simply remove the new system properties in order to fall back to the bundled or system implementation.
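The underlying effect is easy to reproduce natively with the standard zlib utility API: compressing the same input at different levels (or with different implementations) typically yields streams of different lengths, and that length is exactly what the naive copy approach carries over in the ZipEntry. A small, self-contained example (not part of the patch):

#include <stdio.h>
#include <zlib.h>

int main(void) {
  /* Some mildly repetitive input data. */
  unsigned char src[8192];
  for (size_t i = 0; i < sizeof(src); i++) {
    src[i] = (unsigned char)("abcabcabd"[i % 9] + (i % 7));
  }
  /* Compress the same data at the lowest and the highest level. */
  for (int level = 1; level <= 9; level += 8) {
    unsigned char dst[16384];
    uLongf dstLen = sizeof(dst);
    /* compress2() is the one-shot deflate helper from the standard zlib API. */
    int rc = compress2(dst, &dstLen, src, sizeof(src), level);
    if (rc != Z_OK) {
      fprintf(stderr, "compress2 failed: %d\n", rc);
      return 1;
    }
    printf("level %d: %lu compressed bytes\n", level, (unsigned long)dstLen);
  }
  return 0;
}

Whether the two sizes actually differ depends on the input, but for most non-trivial data they do, which is why the compressed size recorded in a copied ZipEntry cannot be relied on after recompression.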
[1] https://www.zlib.net/
[2] https://github.com/cloudflare/zlib
[3] https://chromium.googlesource.com/chromium/src/third_party/zlib/
[4] https://github.com/simonis/zlib-bench/blob/master/Results-ojdk.md
[5] https://github.com/simonis/zlib-bench/blob/master/Results.md
[6] https://github.com/simonis/zlib-bench/tree/master/lib
[7] https://github.com/madler/zlib
[8] https://chromium.googlesource.com/chromium/src/third_party/zlib/
[9] https://github.com/zlib-ng/zlib-ng
[10] https://software.intel.com/en-us/articles/intel-ipp-zlib-coding-functions
[11] https://github.com/cloudflare/zlib
[12] https://github.com/intel/isa-l
[13] https://bugs.openjdk.java.net/browse/JDK-8237750
[14] https://bugs.openjdk.java.net/browse/JDK-8240333
[15] https://bugs.openjdk.java.net/browse/JDK-8240235