VM_ThreadDump performance - 24% of CPU cycle spent on RegisterMap copy

    • Type: Sub-task
    • Resolution: Unresolved
    • Priority: P4
    • tbd
    • Affects Version/s: None
    • Component/s: hotspot

      In a profile of a sample application that uses VM_ThreadDump, 24.52% of the VM thread's CPU cycles were spent copying RegisterMap.
      https://github.com/openjdk/jdk/blob/5dc9723c8172e288872f744bac5fd2342475767a/src/hotspot/share/runtime/vframe.cpp#L97

      RegisterMap is also large, at 4984 bytes across 78 cache lines:

      /* size: 4984, cachelines: 78, members: 10 */
      /* sum members: 4976, holes: 1, sum holes: 7 */
      /* padding: 1 */
      /* paddings: 1, sum paddings: 1 */
      /* last cacheline: 56 bytes */
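
      To illustrate why the size matters, here is a minimal sketch of how the per-frame copy cost adds up during a stack walk. It is not HotSpot code: FakeRegisterMap, FrameWithCopy, FrameWithPointer, and both walk functions are hypothetical names, and only the 4984-byte size is taken from the layout above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a large register map. This is NOT HotSpot's
// RegisterMap; only the 4984-byte size is taken from the reported layout.
struct FakeRegisterMap {
    unsigned char bytes[4984];
};
static_assert(sizeof(FakeRegisterMap) == 4984, "size should match the reported layout");

// Hypothetical frame wrapper that embeds its own copy of the map,
// so every construction copies all 4984 bytes.
struct FrameWithCopy {
    FakeRegisterMap map;
    explicit FrameWithCopy(const FakeRegisterMap* m) : map(*m) {}
};

// Hypothetical frame wrapper that only refers to a shared map.
// This is safe only if the shared map outlives the frame and is not
// modified while the frame still reads it.
struct FrameWithPointer {
    const FakeRegisterMap* map;
    explicit FrameWithPointer(const FakeRegisterMap* m) : map(m) {}
};

// Simulate walking `depth` frames, building one FrameWithCopy per frame.
// Returns the total number of map bytes copied.
std::size_t walk_with_copies(const FakeRegisterMap& shared, std::size_t depth) {
    std::size_t copied = 0;
    for (std::size_t i = 0; i < depth; ++i) {
        FrameWithCopy f(&shared);
        assert(f.map.bytes[0] == shared.bytes[0]);
        copied += sizeof(f.map);
    }
    return copied;
}

// Same walk, but each frame stores a pointer, so no map bytes are copied.
std::size_t walk_with_pointers(const FakeRegisterMap& shared, std::size_t depth) {
    std::size_t copied = 0;
    for (std::size_t i = 0; i < depth; ++i) {
        FrameWithPointer f(&shared);
        assert(f.map == &shared);
    }
    return copied;
}
```

      In this sketch, walking 100 frames with embedded copies moves 498,400 bytes of map data, while the pointer variant moves none. Whether sharing the map is safe for the real vframe depends on how the map is updated during the walk, which this sketch does not model.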

      Perf report attached.

      Passing by value seems to perform better, and the tier 1 and tier 2 tests passed. However, it is unclear whether this affects other VM operations.

            Assignee:
            Neethu Prasad
            Reporter:
            Neethu Prasad
            Votes:
            0
            Watchers:
            3

              Created:
              Updated: