JDK-8320602

Lock contention in SchemaDVFactory.getInstance()

Details

    Description

      ADDITIONAL SYSTEM INFORMATION :
      OpenJDK Runtime Environment (build 21+35-2513)
      OpenJDK 64-Bit Server VM (build 21+35-2513, mixed mode, sharing)

      Observed under Linux (Ubuntu 20.04 LTS), can also be reproduced on other platforms.

      A DESCRIPTION OF THE PROBLEM :
      We observed lock contention on the synchronized method SchemaDVFactory.getInstance() when parsing and validating many XML InputStreams on parallel threads, using JAXB unmarshallers with XML Schema validation enabled.

      In our use case, we were not able to parse/validate more than ~3,000 files per second, although the CPU load on a Xeon(R) W-2195 (18 cores + HT) was only about 30%. Throughput even degraded as we increased the number of threads.

      Profiling with VisualVM and YourKit clearly indicated lock contention on the synchronized method SchemaDVFactory.getInstance().

      We observed that the problem does *not* occur when using the external Xerces2 library for XML Schema validation. Note that in Xerces2, the synchronized keyword (also on getInstance(String factoryClass)) was removed in 2007, with a commit message indicating that they had observed multithreading issues similar to ours:
      https://svn.apache.org/viewvc/xerces/java/trunk/src/org/apache/xerces/impl/dv/SchemaDVFactory.java?revision=558582&view=markup
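
      To illustrate why a synchronized static accessor can become a bottleneck, here is a minimal sketch. It is not the JDK or Xerces2 source: DemoFactory, SynchronizedAccess and HolderAccess are hypothetical names, and the holder idiom is only one possible lock-free alternative, shown under the assumption that the factory is a stateless singleton.

```java
/** Sketch only: contrasts a synchronized accessor with a lock-free one. */
final class FactoryAccessSketch {

    /** Hypothetical stand-in for a datatype-validator factory; NOT the JDK class. */
    static final class DemoFactory { }

    /** Variant A: every call takes the class monitor, even once the instance exists. */
    static final class SynchronizedAccess {
        private static DemoFactory instance;

        static synchronized DemoFactory getInstance() {
            if (instance == null) {
                instance = new DemoFactory();
            }
            return instance;
        }
    }

    /** Variant B: initialization-on-demand holder; callers take no monitor after class init. */
    static final class HolderAccess {
        private static final class Holder {
            // Class initialization guarantees this is created once and published safely.
            static final DemoFactory INSTANCE = new DemoFactory();
        }

        static DemoFactory getInstance() {
            return Holder.INSTANCE;
        }
    }

    public static void main(String[] args) {
        // Both variants hand out a single shared instance; only the locking differs.
        System.out.println(SynchronizedAccess.getInstance() == SynchronizedAccess.getInstance());
        System.out.println(HolderAccess.getInstance() == HolderAccess.getInstance());
    }
}
```

      With Variant A, threads that call getInstance() at the same time queue on one monitor, which matches the kind of waiting the profilers reported. Variant B returns the same singleton without taking that monitor on each call.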

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      1. Create a Schema instance from an XSD using SchemaFactory (in our case, the schema is quite complex)
      2. Create a (single) JAXBContext
      3. Repeatedly start threads to create a multithreaded CPU load. In each thread, create an unmarshaller from the JAXBContext using JAXBContext.createUnmarshaller(), enable schema validation with Unmarshaller.setSchema(), and parse an InputStream (in our use case, ~20 to 100 KB of XML)
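
      The steps above can be approximated with the following sketch. It is not the original reproducer: to stay runnable on a plain JDK it substitutes javax.xml.validation.Validator for the JAXB unmarshaller (JAXB is no longer bundled with the JDK), and it uses a tiny inline schema and document instead of the complex XSD and ~20 to 100 KB inputs from the report. Whether this simplified path reaches SchemaDVFactory.getInstance() as often as the JAXB path is an assumption. The class name, XSD and XML are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

/** Sketch only: validates many small documents against one shared Schema in parallel. */
final class ParallelValidationSketch {

    // Hypothetical, deliberately tiny schema and document.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='item'><xs:complexType><xs:sequence>"
      + "<xs:element name='id' type='xs:int'/>"
      + "<xs:element name='name' type='xs:string'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>";
    static final String XML = "<item><id>42</id><name>demo</name></item>";

    /** Returns how many documents validated successfully. */
    static int validateInParallel(int threads, int documents) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Step 1: one Schema instance, shared by all threads.
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));

            // Step 3 (adapted): each task creates its own Validator, since
            // Validator instances must not be shared between threads.
            List<Future<Boolean>> results = new ArrayList<>();
            for (int i = 0; i < documents; i++) {
                results.add(pool.submit(() -> {
                    Validator validator = schema.newValidator();
                    validator.validate(new StreamSource(new StringReader(XML)));
                    return true; // validate() throws if the document is invalid
                }));
            }
            int ok = 0;
            for (Future<Boolean> result : results) {
                if (result.get()) {
                    ok++;
                }
            }
            return ok;
        } catch (Exception e) {
            throw new IllegalStateException("validation run failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int threads = Runtime.getRuntime().availableProcessors();
        long start = System.nanoTime();
        int ok = validateInParallel(threads, 100_000);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(ok + " documents validated in " + millis + " ms on " + threads + " threads");
    }
}
```

      Running this with different thread counts while watching CPU load or a profiler gives a rough picture of how throughput scales. The numbers will differ from those in the report because the schema and documents are much smaller.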

      EXPECTED VERSUS ACTUAL BEHAVIOR :
      EXPECTED -
      CPU load should reach ~100% when creating enough threads. Profilers should show that threads are either working or sleeping/parked.
      ACTUAL -
      CPU load does not reach ~100%, not even 50%, no matter how many threads are running. Profilers show that threads spend a lot of time waiting for locks.
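
      Without a profiler, the same symptom can be checked by sampling thread states: threads waiting to enter a synchronized method or block report Thread.State.BLOCKED. The sketch below is a self-contained demonstration of that check, using a hypothetical lock object and worker threads rather than the XML code path.

```java
/** Sketch only: shows that threads waiting for a monitor report the BLOCKED state. */
final class BlockedThreadSketch {

    /** Counts how many of the given threads are currently waiting for a monitor. */
    static int countBlocked(Thread[] threads) {
        int blocked = 0;
        for (Thread thread : threads) {
            if (thread.getState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }
        return blocked;
    }

    /** Starts workers that contend for one monitor and returns how many were seen BLOCKED. */
    static int demo(int workerCount) {
        final Object lock = new Object(); // hypothetical stand-in for the contended monitor
        Thread[] workers = new Thread[workerCount];
        int blocked;
        synchronized (lock) {
            for (int i = 0; i < workers.length; i++) {
                workers[i] = new Thread(() -> {
                    synchronized (lock) {
                        // nothing to do; the point is the wait to get in
                    }
                });
                workers[i].start();
            }
            // Poll until every worker is stuck on the monitor, or give up after 5 seconds.
            long deadline = System.nanoTime() + 5_000_000_000L;
            while (countBlocked(workers) < workers.length && System.nanoTime() < deadline) {
                Thread.onSpinWait();
            }
            blocked = countBlocked(workers);
        }
        // The monitor is released here, so the workers can finish.
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return blocked;
    }

    public static void main(String[] args) {
        System.out.println("threads observed BLOCKED: " + demo(4));
    }
}
```

      In a real run, the same countBlocked() check could be applied periodically to the worker threads of the validation pool. A thread dump (for example via jstack) shows the same information along with the monitor each thread is waiting for.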

      CUSTOMER SUBMITTED WORKAROUND :
      See the solution in Xerces2, where the synchronized keyword was removed without losing functionality.

      FREQUENCY : always


            People

              joehw Joe Wang
              webbuggrp Webbug Group