JDK-8169931

8k class metaspace chunks misallocated from 4k chunk freelist


    • gc
    • b150
    • x86_64
    • linux
    • Fix failed

      FULL PRODUCT VERSION :
      java version "1.8.0_102"
      Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
      Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)

      Reproduced with a default clone of OpenJDK jdk8u

      FULL OS VERSION :
      Linux pm-cluster-rhel7-1b 3.10.0-327.28.3.el7.x86_64 #1 SMP Fri Aug 12 13:21:05 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
      (can be reproduced on any 64-bit Linux flavour)


      A DESCRIPTION OF THE PROBLEM :
      We have an application server that generates code when applications are deployed. Recently it started failing during deployment of a large application with "java.lang.OutOfMemoryError: Metaspace" errors. Experimenting with the command-line metaspace configuration flags made no difference. The only thing that worked was disabling CMS entirely, but that is not a practical long-term solution.

      To determine the root cause, we investigated using a clone of the OpenJDK jdk8u Mercurial repository and local JDK builds with extra debug logging. The bug was eventually traced to an implementation error in hotspot/src/share/vm/memory/metaspace.cpp: the ChunkManager::list_index() method returns the wrong index for a humongous class metadata chunk whose size happens to equal the size of a non-class metadata medium chunk (8K words).

      Chunk sizes are specified as follows (from metaspace.cpp):

      enum ChunkSizes {    // in words.
        ClassSpecializedChunk = 128,
        SpecializedChunk = 128,
        ClassSmallChunk = 256,
        SmallChunk = 512,
        ClassMediumChunk = 4 * K,
        MediumChunk = 8 * K
      };

      list_index() is a static method that returns the index of an appropriate freelist:

      ChunkIndex ChunkManager::list_index(size_t size) {
        switch (size) {
          case SpecializedChunk:
            assert(SpecializedChunk == ClassSpecializedChunk,
                   "Need branch for ClassSpecializedChunk");
            return SpecializedIndex;
          case SmallChunk:
          case ClassSmallChunk:
            return SmallIndex;
          case MediumChunk:
          case ClassMediumChunk:
            return MediumIndex;
          default:
            assert(size > MediumChunk || size > ClassMediumChunk,
                   "Not a humongous chunk");
            return HumongousIndex;
        }
      }

      Looking at the code, it is clear that when an 8K class metadata chunk is requested, this method erroneously classifies it as a medium chunk rather than a humongous chunk. As a result, a 4K chunk is taken from the class medium chunk freelist, if one is available there, and that chunk is not big enough to hold the 8K of data needed. Consequently the allocation fails and is retried a couple of times, a GC is initiated, and the allocation is tried again but fails for the same reason, eventually causing the java.lang.OutOfMemoryError.
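
      The misclassification can be shown in isolation. The following is a minimal standalone sketch, not HotSpot code: it copies the chunk sizes and the switch quoted in this report, drops the HotSpot asserts, and adds a hypothetical class-aware variant, list_index_for() with an is_class parameter, purely for contrast. That variant is an illustration only and is not the patch mentioned in this report.

```cpp
#include <cassert>
#include <cstddef>

// Chunk sizes in words, copied from the enum quoted in this report.
const std::size_t K = 1024;
enum ChunkSizes {
  ClassSpecializedChunk = 128,
  SpecializedChunk = 128,
  ClassSmallChunk = 256,
  SmallChunk = 512,
  ClassMediumChunk = 4 * K,
  MediumChunk = 8 * K
};

enum ChunkIndex { SpecializedIndex, SmallIndex, MediumIndex, HumongousIndex };

// Same switch as the quoted list_index(), minus the HotSpot asserts.
ChunkIndex list_index(std::size_t size) {
  switch (size) {
    case SpecializedChunk:
      return SpecializedIndex;
    case SmallChunk:
    case ClassSmallChunk:
      return SmallIndex;
    case MediumChunk:
    case ClassMediumChunk:
      return MediumIndex;
    default:
      return HumongousIndex;
  }
}

// Hypothetical class-aware variant (an assumption for illustration):
// compares only against the sizes that belong to the requested space.
ChunkIndex list_index_for(bool is_class, std::size_t size) {
  const std::size_t specialized = is_class ? ClassSpecializedChunk : SpecializedChunk;
  const std::size_t small = is_class ? ClassSmallChunk : SmallChunk;
  const std::size_t medium = is_class ? ClassMediumChunk : MediumChunk;
  if (size == specialized) return SpecializedIndex;
  if (size == small) return SmallIndex;
  if (size == medium) return MediumIndex;
  return HumongousIndex;
}

// Small self-check: an 8K-word request is "medium" for the quoted switch,
// but humongous once the class space is taken into account.
void self_check() {
  assert(list_index(8 * K) == MediumIndex);
  assert(list_index_for(true, 8 * K) == HumongousIndex);
}
```

      Calling self_check() from a test driver exercises both functions. The point of the contrast is that the size alone does not identify the freelist; the space the request belongs to matters as well.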

      The error *only* occurs when there are free chunks available on the medium chunk freelist. If there aren't any there, new chunks *of the correct size* are allocated from virtual memory space and all is well.
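
      Why the failure depends on the state of the freelist can be illustrated with a toy model. This is a sketch under stated assumptions, not the HotSpot allocator: ToyClassChunkManager, treated_as_medium() and get_chunk() are hypothetical names, only the medium freelist of the class space is modelled, and a request that finds no free chunk is assumed to receive a fresh chunk of exactly the requested size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t K = 1024;
const std::size_t ClassMediumChunk = 4 * K;  // words, from the quoted enum
const std::size_t MediumChunk = 8 * K;       // words, from the quoted enum

// Toy model of a class-space chunk manager (hypothetical, for illustration).
struct ToyClassChunkManager {
  // Sizes, in words, of the chunks currently on the medium freelist.
  std::vector<std::size_t> medium_freelist;

  // Mirrors the two medium cases of the quoted switch: both 4K and 8K
  // requests are routed to the medium freelist.
  static bool treated_as_medium(std::size_t size) {
    return size == MediumChunk || size == ClassMediumChunk;
  }

  // Returns the size of the chunk handed out for a request of
  // chunk_word_size words.
  std::size_t get_chunk(std::size_t chunk_word_size) {
    if (treated_as_medium(chunk_word_size) && !medium_freelist.empty()) {
      // Reuse whatever is on the freelist, without checking its size.
      std::size_t chunk = medium_freelist.back();
      medium_freelist.pop_back();
      return chunk;
    }
    // Assumption: a fresh chunk always has the requested size.
    return chunk_word_size;
  }
};

// Self-check covering both situations described in the report.
void self_check() {
  ToyClassChunkManager with_free_chunk;
  with_free_chunk.medium_freelist.push_back(ClassMediumChunk);
  assert(with_free_chunk.get_chunk(8 * K) == 4 * K);  // too small

  ToyClassChunkManager empty;
  assert(empty.get_chunk(8 * K) == 8 * K);  // correct size
}
```

      In this model the same 8K request succeeds or fails depending only on whether a 4K chunk happens to be free, which matches the intermittent behaviour described above.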

      THE PROBLEM WAS REPRODUCIBLE WITH -Xint FLAG: Did not try

      THE PROBLEM WAS REPRODUCIBLE WITH -server FLAG: Yes

      REGRESSION. Last worked in version 7u80

      STEPS TO FOLLOW TO REPRODUCE THE PROBLEM :
      Load a class requiring an 8K class metadata chunk when there are 4K chunks available on the medium chunk freelist.


      EXPECTED VERSUS ACTUAL BEHAVIOR :
      Expected: The class should load successfully

      Actual: A java.lang.OutOfMemoryError: Metaspace error occurs

      ERROR MESSAGES/STACK TRACES THAT OCCUR :
      Metaspace debug log showing a (failed) request for 8061 words being "satisfied" using a 4096 word chunk:

      SpaceManager::grow_and_allocate for 8061 words 2627 words used 1469 words left
      Metadata humongous allocation:
        word_size 0x0000000000001f7d
        chunk_word_size 0x0000000000002000
          chunk overhead 0x0000000000000005
      ChunkManager::free_chunks_get: free_list 0x00007f57c00a3fc0 head 0x0000000104729c00 size 4096
      ChunkManager::chunk_freelist_allocate: 0x00007f57c00a3f80 chunk 0x0000000104729c00 size 4096 count 292 Free chunk total 1285504 count 609
      SpaceManager::add_chunk: 8) Metachunk: bottom 0x0000000104729c00 top 0x0000000104729c28 end 0x0000000104731c00 size 4096
          used 5 free 4091
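
      The figures in this log can be cross-checked with a little arithmetic. The sketch below assumes 8-byte words (a 64-bit VM); words_between() and request_fits() are hypothetical helper names, and all constants are copied from the log lines above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-bit VM, so one word is 8 bytes.
const std::size_t kBytesPerWord = 8;

// Number of words between two addresses taken from the log.
std::size_t words_between(std::uint64_t low, std::uint64_t high) {
  assert(high >= low);
  return static_cast<std::size_t>((high - low) / kBytesPerWord);
}

// True if a request of word_size words fits into free_words words.
bool request_fits(std::size_t word_size, std::size_t free_words) {
  return word_size <= free_words;
}

// Values copied from the log.
const std::size_t kWordSize = 0x1f7d;       // requested words (8061)
const std::size_t kChunkWordSize = 0x2000;  // humongous chunk size asked for (8192)
const std::size_t kOverhead = 0x5;          // chunk overhead in words
const std::uint64_t kBottom = 0x0000000104729c00ULL;
const std::uint64_t kTop = 0x0000000104729c28ULL;
const std::uint64_t kEnd = 0x0000000104731c00ULL;

// Self-check: the chunk actually handed out is 4096 words with 4091 free,
// which cannot hold the 8061-word request.
void self_check() {
  std::size_t chunk_words = words_between(kBottom, kEnd);
  std::size_t used_words = words_between(kBottom, kTop);
  assert(chunk_words == 4096);
  assert(used_words == kOverhead);
  assert(!request_fits(kWordSize, chunk_words - used_words));
  // The chunk size that was asked for would have been large enough.
  assert(request_fits(kWordSize + kOverhead, kChunkWordSize));
}
```

      The addresses confirm the sizes printed in the log: the chunk spans 4096 words, 5 of which are overhead, leaving 4091 words for a request of 8061.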


      REPRODUCIBILITY :
      This bug can be reproduced often.

      ---------- BEGIN SOURCE ----------
      Once the issue was understood, an attempt was made to create a standalone test case that reproduces it, but that effort has so far failed.
      ---------- END SOURCE ----------

      CUSTOMER SUBMITTED WORKAROUND :
      Disabling CMS GC is the only known effective workaround.

      A patch against OpenJDK that fixes the issue has been written, but it is too big to fit here.

            stefank Stefan Karlsson
            webbuggrp Webbug Group