JDK-4774903

Symantec JIT is causing deadlocks on certain Pentium systems




      The Symantec JIT is causing a server application to hang on certain Intel
      servers. The workaround is to disable the JIT and run the JVM in
      interpreted mode, which makes the customer's application run slower. The
      customer ships with JRE 1.2.2_005 and has tested with JRE 1.2.2_014; the
      problem still exists. symcjit.dll version is contained in both versions.

      Customer Problem Description:
      While benchmarking our environment, we observed an unusual number of
      LOCK# signal assertions on the Pentium bus. This signal literally locks
      the entire memory for its duration. It is usually caused by atomic
      operations that cross cache line boundaries. This is legacy x86 behavior
      that most processor architectures do not support. If it happens
      frequently, the performance of the machine is degraded. In essence, it
      forces the CPU to go out to the bus, and then turns an SMP system into a
      UP system by locking all the remaining processors and the I/O bridges
      out of the bus.

      Further analysis showed that the JIT compiler symcjit.dll in the Sun Java
      distribution was responsible. This DLL was executing a Pentium XCHG
      instruction operating on misaligned data (addresses other than an integer
      multiple of 4 bytes). The operative keyword here is 'misaligned data'.
      The XCHG instruction by itself does not cause this. However, if the XCHG
      operand is on a misaligned boundary, then the processor must assert the
      LOCK# signal to ensure atomic access. Note that this is a performance
      issue even on a UP system because the signal must be asserted regardless...
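The two conditions mentioned in the report (an operand address that is not a multiple of 4, and an access that crosses a cache line boundary) can be checked with simple address arithmetic. The following is only a sketch: the helper names are hypothetical, and the 32-byte line size used in the usage note is an assumption about Pentium-era processors, not something stated in this report.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when addr is a multiple of `alignment`.
 * `alignment` must be a power of two (4 for a 32-bit XCHG operand). */
bool is_aligned(uintptr_t addr, size_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}

/* Hypothetical helper: true when an access of `size` bytes starting at
 * addr touches two different cache lines. `line_size` must be a power
 * of two and is supplied by the caller (assumption: 32 on Pentium-era
 * parts). */
bool crosses_cache_line(uintptr_t addr, size_t size, size_t line_size)
{
    uintptr_t mask = ~(uintptr_t)(line_size - 1);
    uintptr_t first_line = addr & mask;              /* line of first byte */
    uintptr_t last_line = (addr + size - 1) & mask;  /* line of last byte  */
    return first_line != last_line;
}
```

For example, with a 32-byte line a 4-byte operand at 0x101e is both misaligned and split across two lines, while one at 0x1002 is misaligned but stays within a single line.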

      It is very unusual to see misaligned data combined with locked access.
      Most processor architectures support neither bus locking nor misaligned
      access, and most compilers will not place data on misaligned boundaries.
      So what is going on? I speculate that this XCHG instruction is hand-coded
      in inline assembly and that its operand is cast from a smaller data type,
      causing it to be misaligned, or perhaps it is simply a bug.
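To illustrate the kind of cast speculated about above, the sketch below computes the address a 32-bit operand would have if a pointer to it were derived from a byte buffer at an arbitrary offset. It is not taken from symcjit.dll; `storage`, `operand_address`, and `is_4byte_aligned` are hypothetical names, and the code only computes addresses without dereferencing a misaligned pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte storage that itself starts on a 4-byte boundary (C11 _Alignas). */
static _Alignas(4) unsigned char storage[16];

/* Hypothetical: address a 32-bit operand would get if it were carved
 * out of `storage` at byte_offset, e.g. by casting a byte pointer to a
 * 32-bit pointer. Nothing is read or written here. */
uintptr_t operand_address(size_t byte_offset)
{
    return (uintptr_t)(storage + byte_offset);
}

/* Nonzero when addr is a multiple of 4. */
int is_4byte_aligned(uintptr_t addr)
{
    return (addr & 3u) == 0;
}
```

Offsets 0 and 4 yield aligned operand addresses; offsets 1, 2, and 3 do not, even though the buffer itself is aligned.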

      Fixing this problem should be very easy. In symcjit.dll, ensure that the
      operands of the XCHG instruction in the function at 500b8410 are 4-byte
      aligned.

      // This is the function (atomic exchange) that causes LOCK# signal
      // assertions due to misaligned data.

      kd> u 500b8410
      500b8410 8b542404 mov edx,[esp+0x4]
      500b8414 8b442408 mov eax,[esp+0x8]
      500b8418 8702 xchg [edx],eax // the instruction issuing the locked bus cycles
      500b841a c3 ret
      500b841b 0500000000 add eax,0x0
      500b8420 50 push eax
      500b8421 51 push ecx
      500b8422 8b08 mov ecx,[eax]
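The first four instructions above load a pointer and a value from the stack, exchange the value with the memory word, and return the old contents in eax. A rough C11 counterpart is sketched below under stated assumptions: `exchange_word` and `word_is_aligned` are hypothetical names, not symbols from symcjit.dll, and the sketch relies on `atomic_int` being naturally aligned, which holds on mainstream ABIs.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical counterpart of the routine at 500b8410: store new_value
 * into *target and return the previous contents as one atomic step. */
int exchange_word(atomic_int *target, int new_value)
{
    return atomic_exchange(target, new_value);
}

/* Nonzero when target lies on a 4-byte boundary. A caller could use
 * this as a debug check before performing the exchange. */
int word_is_aligned(const volatile void *target)
{
    return ((uintptr_t)target & 3u) == 0;
}
```

Declaring the lock word with an atomic type, rather than casting into a byte buffer, leaves its placement to the compiler.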

      // This is the code that calls the atomic exchange.

      kd> u 5009cbe0
      5009cbe0 54 push esp
      5009cbe1 2434 and al,0x34
      5009cbe3 6808270c50 push 0x500c2708
      5009cbe8 52 push edx
      5009cbe9 8944243c mov [esp+0x3c],eax
      5009cbed 894c2440 mov [esp+0x40],ecx
      5009cbf1 e80a8e0100 call 500b5a00
      5009cbf6 57 push edi
      5009cbf7 55 push ebp
      5009cbf8 e813b80100 call 500b8410 // Call to atomic exchange function.
      5009cbfd 8a460c mov al,[esi+0xc]
      5009cc00 83c408 add esp,0x8
      5009cc03 a802 test al,0x2
      5009cc05 750a jnz 5009cc11
      5009cc07 8b4c2410 mov ecx,[esp+0x10]
      5009cc0b 56 push esi

      And here are the address maps that will help debug
      (See attached file: d1584.txt)(See attached file: symcjit.txt)




            arorcl Anupam R (Inactive)
            atongschsunw Albert Tong-schmidt (Inactive)