JDK-8320602

Lock contention in SchemaDVFactory.getInstance()

        ADDITIONAL SYSTEM INFORMATION :
        OpenJDK Runtime Environment (build 21+35-2513)
        OpenJDK 64-Bit Server VM (build 21+35-2513, mixed mode, sharing)

        Observed under Linux (Ubuntu 20.04 LTS), can also be reproduced on other platforms.

        A DESCRIPTION OF THE PROBLEM :
        We observed lock contention on the synchronized method SchemaDVFactory.getInstance() when parsing and validating many XML InputStreams in parallel threads, using JAXB unmarshallers with XML Schema validation enabled.

        In our use case, we were not able to parse and validate more than ~3,000 files per second, even though the CPU load on a Xeon(R) W-2195 (18 cores + HT) was only about 30%. Throughput even degraded as we increased the number of threads.

        Profiling with VisualVM and YourKit clearly indicated lock contention on the synchronized method SchemaDVFactory.getInstance().
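
A minimal sketch of how such blocking can be spotted without a profiler, using the standard java.lang.management API. The class and method names (BlockedThreadSnapshot, blockedThreadsByLock) are hypothetical and only for illustration. Since a static synchronized method locks on its Class object, threads blocked on it would be expected to show a lock name starting with java.lang.Class@, though a single snapshot only shows the threads blocked at that instant.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

public class BlockedThreadSnapshot {

    /** Counts threads currently in BLOCKED state, grouped by the monitor they wait for. */
    static Map<String, Integer> blockedThreadsByLock() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> counts = new TreeMap<>();
        for (ThreadInfo info : mx.getThreadInfo(mx.getAllThreadIds())) {
            // Entries are null for threads that ended between the two calls.
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                counts.merge(String.valueOf(info.getLockName()), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Call this periodically while the workload runs and compare snapshots.
        System.out.println(blockedThreadsByLock());
    }
}
```

Taking several snapshots while the workload is running and seeing the same lock name dominate would be consistent with the profiler findings described above.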

        We observed that the problem does *not* occur when using the external Xerces2 library for XML Schema validation. Note that in Xerces2, the synchronized keyword (also on getInstance(String factoryClass)) was removed in 2007, with a commit message indicating that they had observed multithreading issues similar to ours:
        https://svn.apache.org/viewvc/xerces/java/trunk/src/org/apache/xerces/impl/dv/SchemaDVFactory.java?revision=558582&view=markup
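
To illustrate why a synchronized static factory method can limit throughput, here is a generic sketch. It is not the JDK or Xerces2 source: DvFactory and both getInstance variants are hypothetical stand-ins, and the sketch assumes the method body touches no shared mutable state, which is the condition under which dropping the keyword is safe.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

public class FactoryLockSketch {

    /** Hypothetical stand-in for the factory type. */
    static class DvFactory { }

    // Variant A: every call must acquire the class monitor, so callers are serialized.
    static synchronized DvFactory getInstanceSynchronized() {
        return new DvFactory();
    }

    // Variant B: same body without the lock; safe here because no shared state is touched.
    static DvFactory getInstanceUnsynchronized() {
        return new DvFactory();
    }

    /** Calls one variant from several threads and returns how many instances were obtained. */
    static long callFromThreads(int threads, int callsPerThread, boolean useSynchronized) {
        AtomicLong obtained = new AtomicLong();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    DvFactory f = useSynchronized
                            ? getInstanceSynchronized()
                            : getInstanceUnsynchronized();
                    if (f != null) {
                        obtained.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return obtained.get();
    }

    public static void main(String[] args) {
        System.out.println(callFromThreads(8, 100_000, true));
        System.out.println(callFromThreads(8, 100_000, false));
    }
}
```

Both variants return the same number of instances. The difference is only that variant A forces all threads through one monitor, which is the kind of bottleneck this report describes.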

        STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
        1. Create Schema instance from XSD using SchemaFactory (in our case, the schema is quite complex)
        2. Create (single) JAXBContext
        3. Repeatedly start threads to create a multithreaded CPU load. In each thread, create an unmarshaller from the JAXBContext using JAXBContext.createUnmarshaller(), enable schema validation with Unmarshaller.setSchema(), and parse an InputStream (in our use case, ~20-100 KB of XML)
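
The steps above can be approximated with the sketch below. It deviates from the report in labeled ways: it uses javax.xml.validation.Validator directly instead of a JAXB Unmarshaller (JAXB is not bundled with current JDKs), and it uses a tiny inline schema and document rather than a complex XSD and 20-100 KB inputs. Whether this path calls SchemaDVFactory.getInstance() as often as the JAXB path is an assumption, not something verified here. All class, method, and constant names are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

public class ParallelValidationSketch {

    // Placeholder schema and document; the report's real inputs are far larger.
    static final String XSD =
            "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
          + "<xs:element name=\"item\" type=\"xs:int\"/></xs:schema>";
    static final String XML = "<item>42</item>";

    /** Step 1: build one Schema instance; Schema objects may be shared across threads. */
    static Schema loadSchema() {
        try {
            return SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Step 3: validate documents from many threads; returns the number that passed. */
    static int validateInParallel(int threads, int docsPerThread) {
        Schema schema = loadSchema();
        AtomicInteger valid = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < docsPerThread; i++) {
                    try {
                        // Validator is not thread-safe, so create one per document.
                        Validator validator = schema.newValidator();
                        validator.validate(new StreamSource(new StringReader(XML)));
                        valid.incrementAndGet();
                    } catch (SAXException | IOException e) {
                        throw new IllegalStateException(e);
                    }
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return valid.get();
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        long start = System.nanoTime();
        int n = validateInParallel(threads, 10_000);
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("%d documents in %.2f s with %d threads%n", n, seconds, threads);
    }
}
```

Running main with different thread counts while watching CPU load gives a rough way to check whether throughput scales with the number of threads.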

        EXPECTED VERSUS ACTUAL BEHAVIOR :
        EXPECTED -
        CPU load should reach ~100% when creating enough threads. Profilers should show that threads are either working or sleeping/parked.
        ACTUAL -
        CPU load does not reach ~100%; it does not even reach 50%, no matter how many threads are running. Profilers show that threads spend a lot of time waiting for locks.

        CUSTOMER SUBMITTED WORKAROUND :
        See the solution in Xerces2, where the synchronized keyword was removed without losing functionality.

        FREQUENCY : always


              joehw Joe Wang
              webbuggrp Webbug Group