JDK-8294316

SA core file support is broken on macosx-x64 starting with macOS 12.x

Details

    • Type: Bug
    • Resolution: Fixed
    • Priority: P4
    • Fix Version: 22
    • Affects Version: 20
    • Component: hotspot
    • Resolved In Build: b04
    • CPU: x86_64
    • OS: os_x

    Description

      SA no longer appears to work with core files on macosx-x64, I believe starting with macOS 12.x. macosx-aarch64 seems to be fine, as do earlier macOS versions on macosx-x64. All of the SA core file tests in test/hotspot/jtreg/serviceability/sa fail with:

      ERROR: failed to workaround classshareing
      Unable to open core file

      I added some debugging code to SA's init_classsharing_workaround(), which indicated that the failure is related to fetching the value of SharedArchivePath from the core file. This is supposed to point to a C string containing the classes.jsa path, but instead it seemed to contain garbage. I modified hotspot to print out &SharedArchivePath, SharedArchivePath, and the C string it points to:

      log_info(cds)("Got default archive path: %p %p %s", &SharedArchivePath, SharedArchivePath, SharedArchivePath);

      When SA fails to open the core file, I see:

      [0.003s][info][cds] Got default archive path: 0x10faccb30 0x6000008b8010 /System/Volumes/Data/mesos/work_dir/jib-master/install/2022-09-22-2232312.chris.plummer.jdk/macosx-x64-debug.jdk/jdk-20/fastdebug/lib/server/classes.jsa

      This all looks fine. However, SA looks up the "SharedArchivePath" symbol to get its address, reads the pointer value stored at that address, and then follows that pointer to the classes.jsa path. So I also modified SA to print out this info:

            printf("sharedArchivePathAddrAddr(%p)\n", (void*)sharedArchivePathAddrAddr);
            printf("sharedArchivePathAddr (%p)\n", (void*)sharedArchivePathAddr);

      In the passing test cases this output matches the CDS log output above. When it fails you get something different:

      Opening core file, please wait...
      hsdb>
      sharedArchivePathAddrAddr(0x10f881b30)
      sharedArchivePathAddr (0x7364616572687420)

      sharedArchivePathAddrAddr should match the hotspot &SharedArchivePath output, but it doesn't. SA does a symbol table lookup to get this value, so there appears to be a bug in SA's Mach-O symbol table handling code.
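
      As a quick sanity check on the "garbage" value, the two small helpers below show one way to inspect it. They are a hypothetical sketch and not part of SA. The first renders a 64-bit word as the ASCII bytes it would occupy in little-endian (x86_64) memory; the second computes how far apart the real and looked-up symbol addresses are. Applied to the values above, 0x7364616572687420 decodes to the text " threads", and 0x10faccb30 and 0x10f881b30 differ by 0x24b000, a whole number of 4 KiB pages. Assuming both outputs come from the same run, that is consistent with SA reading string data at a wrong address rather than reading a corrupted pointer, although it does not by itself identify the bug.

```c
#include <stdint.h>
#include <stddef.h>

/* Render the 8 bytes of a 64-bit word, lowest byte first (the order they
 * occupy in little-endian memory), as printable text. Non-printable bytes
 * become '.'. The caller supplies a buffer of at least 9 bytes. */
void word_as_ascii(uint64_t word, char out[9]) {
    for (int i = 0; i < 8; i++) {
        unsigned char b = (unsigned char)((word >> (8 * i)) & 0xff);
        out[i] = (b >= 0x20 && b < 0x7f) ? (char)b : '.';
    }
    out[8] = '\0';
}

/* Distance between the address HotSpot reported for a symbol and the
 * address the (hypothetical) symbol lookup returned. */
uint64_t lookup_skew(uint64_t actual_addr, uint64_t looked_up_addr) {
    return actual_addr - looked_up_addr;
}
```

      A pointer value that decodes to readable text is a common sign that the reader dereferenced the wrong location.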

      This problem has gone unnoticed because all core file testing on macosx-x64 has been problem-listed for probably a year now due to occasional timeouts (slow core dumps). The issue seems to happen only on 12.3.1, 12.4, and 12.5.1 hosts, and it happens every time on these hosts, so it was likely introduced with macOS 12.

      I'm not seeing this on macosx-aarch64, although on occasion I was seeing the same "ERROR: failed to workaround classshareing" failure message there. However, I believe it was for a different reason. From what I could tell from some debugging with lldb, the memory that SharedArchivePath pointed to was not in the core file. For some reason I can't reproduce this anymore. It could be related to JDK-8293563, which is caused by the Java heap not being in the core file. Possibly other areas of memory are sometimes missing as well.

      Note that if you try using -Xshare:off, you still see this same issue with SharedArchivePath, even though SA should not need to access it. This is because SA first reads UseSharedSpaces to see if it is 0 or 1. It should be 0, but due to the same issue we see with SharedArchivePath (symbol lookup not working properly), UseSharedSpaces could contain anything, and usually it is not 0. To work around this, I forced SA to exit init_classsharing_workaround() immediately no matter what UseSharedSpaces is set to. This caused SA to fail at a later point during initialization instead, when trying to look up some hotspot types. It does so through vmstructs, which SA accesses via other global symbols that SA also appears to be looking up incorrectly. So it appears that SA's symbol table lookups are broken in general with core files on 12.x, and that the problem is not limited to just a few global symbols.
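
      The dependency described above, where everything SA reads starts from a symbol's address, can be illustrated with a minimal sketch. This is not SA's actual code: region_t, read_mem(), and read_path_via_symbol() are hypothetical names, and the "core file" is just one in-memory region. It shows the two-step read (symbol address to pointer value, pointer value to C string) and why a wrong symbol address makes the second step either fail or return unrelated bytes.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one mapped region of a core file. */
typedef struct {
    uint64_t base;              /* virtual address of data[0] */
    size_t size;                /* number of bytes in data */
    const unsigned char *data;  /* region contents */
} region_t;

/* Copy len bytes starting at vaddr into buf.
 * Returns 0 on success, -1 if the range is not fully inside the region. */
int read_mem(const region_t *r, uint64_t vaddr, void *buf, size_t len) {
    if (vaddr < r->base || len > r->size || vaddr - r->base > r->size - len) {
        return -1;
    }
    memcpy(buf, r->data + (vaddr - r->base), len);
    return 0;
}

/* Step 1: read the pointer stored at the symbol's address.
 * Step 2: follow that pointer and copy the NUL-terminated string.
 * Returns 0 on success, -1 if any read falls outside the region or
 * no terminator is found within out_len bytes. */
int read_path_via_symbol(const region_t *r, uint64_t sym_addr,
                         char *out, size_t out_len) {
    uint64_t str_addr;
    if (out_len == 0) return -1;
    if (read_mem(r, sym_addr, &str_addr, sizeof str_addr) != 0) return -1;
    for (size_t i = 0; i < out_len; i++) {
        if (read_mem(r, str_addr + i, &out[i], 1) != 0) return -1;
        if (out[i] == '\0') return 0;
    }
    return -1;
}
```

      With the correct symbol address the call returns the stored path; with an address that is off by even a few bytes, the first read yields an arbitrary "pointer" and the lookup fails or produces garbage, which matches the symptom seen for both SharedArchivePath and UseSharedSpaces.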

            People

              Assignee: Tom Rodriguez (never)
              Reporter: Chris Plummer (cjplummer)