JDK-5054916

HPROF: potential performance improvements and other misc ideas


    • Enhancement
    • Resolution: Won't Fix
    • P5
    • None
    • 5.0
    • tools
    • 1.5
    • 5.0
    • generic
    • generic

      This is mostly a reminder RFE.

      The hprof in v1.5.0 was converted to use JVMTI, and experience over time
      has caused me to rethink some of the design of this code. This is just a
      list of items that I had, documented here so they are not lost.

      * Memory allocation. It uses malloc() and, worse, realloc(), and I'm
        concerned that this causes threads to block unnecessarily.
         One idea was to change the tables so that allocated memory is never
         moved, i.e. don't use realloc(). Allocate pages of table entries and,
         based on the table index, find the page. A lock might be needed on the
         page lookup but not on access to the table element. That would be a
         quick monitor enter/exit, and the actual pointer could even be used as
         the Tag on objects or the ThreadLocalStorage pointer, avoiding a table
         lookup for the TlsIndex but, more importantly, avoiding the monitor
         contention.
         The ideal is no monitors held during these hot BCI Object.<init> events.
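
         A minimal sketch of the paged-table idea, not the hprof_table.c code.
         The names PagedTable, Entry, paged_table_add() and paged_table_get(),
         and the page and directory sizes, are all assumptions made for
         illustration. Locking is left out; only the add and the page lookup
         would need it.

```c
#include <assert.h>
#include <stdlib.h>

#define PAGE_SHIFT 8                      /* 256 entries per page (arbitrary) */
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define MAX_PAGES  1024                   /* fixed directory, so it never moves */

typedef struct { int payload; } Entry;    /* hypothetical table element */

typedef struct {
    Entry   *pages[MAX_PAGES];            /* page directory */
    unsigned count;                       /* entries handed out so far */
} PagedTable;

/* Append one zeroed entry and return its index, or -1 on failure.
 * A new page is allocated on demand; existing pages are never touched. */
static long paged_table_add(PagedTable *t)
{
    unsigned index = t->count;
    unsigned page  = index >> PAGE_SHIFT;

    if (page >= MAX_PAGES) return -1;
    if (t->pages[page] == NULL) {
        t->pages[page] = calloc(PAGE_SIZE, sizeof(Entry));
        if (t->pages[page] == NULL) return -1;
    }
    t->count = index + 1;
    return (long)index;
}

/* Map an index to its entry. The returned pointer stays valid for the
 * life of the table because pages are never reallocated or moved. */
static Entry *paged_table_get(PagedTable *t, unsigned index)
{
    assert(index < t->count);
    return &t->pages[index >> PAGE_SHIFT][index & (PAGE_SIZE - 1)];
}

/* Release every page; free(NULL) is a no-op for unused directory slots. */
static void paged_table_free(PagedTable *t)
{
    for (unsigned i = 0; i < MAX_PAGES; i++) free(t->pages[i]);
}
```

         Because an Entry pointer never changes once handed out, it could be
         stored directly as the object Tag or the thread-local storage value.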
         
         It might be nice to have a per-thread memory allocator for some things,
         or to use something like alloca() for short-lived allocations.
         Dumping class instances and getAllFields() do a lot of allocations that
         could be just alloca()-type allocations.
         This applies to signature_to_name() in hprof_io.c in particular.

         If we had a native library where we could create a separate heap area
         for each thread, e.g. heap = new_heap("Thread 1", initial_size);
         and allocate with ptr = heap_malloc(heap, nbytes);
         that sure could help matters. See the separate RFE on this.
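
         One way such a per-thread heap could look is a simple bump allocator,
         sketched here under assumptions. Only the new_heap() and heap_malloc()
         call shapes come from the text above; the Heap and Chunk layouts, the
         growth policy, and delete_heap() are hypothetical. Each heap is meant
         to be used by a single thread, so no locking appears, and individual
         blocks are never freed, only the whole heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Round n up to the strictest fundamental alignment (C11). */
#define ALIGN_UP(n) \
    (((n) + _Alignof(max_align_t) - 1) & ~(_Alignof(max_align_t) - 1))

typedef struct Chunk {
    struct Chunk *next;                    /* older chunk, or NULL */
    size_t        used;                    /* bytes handed out so far */
    size_t        size;                    /* payload capacity in bytes */
    max_align_t   data[];                  /* payload, suitably aligned */
} Chunk;

typedef struct {
    const char *name;                      /* label only, e.g. "Thread 1" */
    Chunk      *chunks;                    /* newest chunk first */
    size_t      chunk_size;                /* default size of new chunks */
} Heap;

static Chunk *chunk_new(size_t size, Chunk *next)
{
    Chunk *c = malloc(sizeof(Chunk) + size);
    if (c != NULL) { c->next = next; c->used = 0; c->size = size; }
    return c;
}

static Heap *new_heap(const char *name, size_t initial_size)
{
    Heap *h = malloc(sizeof(Heap));
    if (h == NULL) return NULL;
    h->name = name;
    h->chunk_size = initial_size;
    h->chunks = chunk_new(initial_size, NULL);
    if (h->chunks == NULL) { free(h); return NULL; }
    return h;
}

/* Bump-allocate nbytes. Existing blocks never move: when the current
 * chunk is full, a new chunk is chained in front of it. */
static void *heap_malloc(Heap *h, size_t nbytes)
{
    assert(h != NULL);
    size_t need = ALIGN_UP(nbytes);
    Chunk *c = h->chunks;
    if (c->size - c->used < need) {
        size_t size = need > h->chunk_size ? need : h->chunk_size;
        c = chunk_new(size, h->chunks);
        if (c == NULL) return NULL;
        h->chunks = c;
    }
    void *p = (char *)c->data + c->used;
    c->used += need;
    return p;
}

/* Free every chunk and the heap itself in one pass. */
static void delete_heap(Heap *h)
{
    Chunk *c = h->chunks;
    while (c != NULL) { Chunk *next = c->next; free(c); c = next; }
    free(h);
}
```

         Short-lived allocations, like the ones made while dumping class
         instances, could then be released all at once with delete_heap().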
         
        * Need a faster way to get from jclass to ClassIndex; need to avoid
          GetClassSignature and the required Deallocate if we can. Perhaps a
          lookup of a ClassIndex via the jclass Object hashCode? Separate lookup
          tables in hprof_table.c would be a nice addition to this table
          functionality.
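
          A rough sketch of what a hash-keyed lookup might look like, written
          without the JVMTI headers so it stands alone. ClassRef is a stand-in
          for jclass, and ClassLookup, class_lookup_put(), class_lookup_get()
          and same_pointer() are hypothetical names. In a real agent the hash
          would presumably come from JVMTI GetObjectHashCode and the equality
          check from JNI IsSameObject; that wiring is an assumption and is not
          shown.

```c
#include <assert.h>
#include <stdlib.h>

typedef int         ClassIndex;            /* 0 means "not found" here */
typedef const void *ClassRef;              /* stand-in for jclass */
typedef int (*SameObjectFn)(ClassRef a, ClassRef b);

typedef struct Node {
    struct Node *next;
    int          hash;                     /* object hash code */
    ClassRef     klass;
    ClassIndex   index;
} Node;

#define BUCKETS 256

typedef struct {
    Node        *buckets[BUCKETS];
    SameObjectFn same;                     /* identity test for two refs */
} ClassLookup;

/* Placeholder identity test; a real agent would compare object
 * references through JNI rather than raw pointers. */
static int same_pointer(ClassRef a, ClassRef b) { return a == b; }

/* Record klass -> index under its hash. Returns 0 on success, -1 on
 * allocation failure. */
static int class_lookup_put(ClassLookup *t, int hash, ClassRef klass,
                            ClassIndex index)
{
    Node *n = malloc(sizeof(Node));
    if (n == NULL) return -1;
    unsigned b = (unsigned)hash % BUCKETS;
    n->hash = hash; n->klass = klass; n->index = index;
    n->next = t->buckets[b];
    t->buckets[b] = n;
    return 0;
}

/* Find the index for klass. The cheap hash comparison filters the chain
 * before the (possibly expensive) identity test runs. */
static ClassIndex class_lookup_get(const ClassLookup *t, int hash,
                                   ClassRef klass)
{
    assert(t->same != NULL);
    for (Node *n = t->buckets[(unsigned)hash % BUCKETS]; n; n = n->next) {
        if (n->hash == hash && t->same(n->klass, klass)) return n->index;
    }
    return 0;
}

static void class_lookup_free(ClassLookup *t)
{
    for (int b = 0; b < BUCKETS; b++) {
        Node *n = t->buckets[b];
        while (n != NULL) { Node *next = n->next; free(n); n = next; }
        t->buckets[b] = NULL;
    }
}
```

          A hit avoids the signature string entirely; only a miss would need
          to fall back to GetClassSignature and its matching Deallocate.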

        * GetStackTrace appears to be expensive. Could we avoid calling it every
          time? Could we use the cpu=times mechanism to get the stack for
          hprof=* instead of asking JVMTI? How big an issue is GetStackTrace
          performance?

        * Table walk needs a short-circuit mechanism. Callbacks should return a
          value that tells the table walker to abort or continue the walk.
          The places where table walks are used for searches (TLS) would benefit,
          along with the loader table.
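
          The short-circuit walk could be as small as the following sketch.
          WalkAction, WalkCallback, table_walk() and the find_int() example
          are hypothetical names, and the table is simplified to a flat array;
          this is not the existing hprof_table.c interface.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { WALK_CONTINUE = 0, WALK_ABORT = 1 } WalkAction;

/* Callback invoked per entry; its return value steers the walk. */
typedef WalkAction (*WalkCallback)(size_t index, void *entry, void *arg);

/* Visit entries in order until the callback asks to stop.
 * Returns the number of entries actually visited. */
static size_t table_walk(void *entries, size_t count, size_t entry_size,
                         WalkCallback cb, void *arg)
{
    assert(cb != NULL);
    char *p = entries;
    for (size_t i = 0; i < count; i++) {
        if (cb(i, p + i * entry_size, arg) == WALK_ABORT) return i + 1;
    }
    return count;
}

/* Example search: stop at the first entry equal to 'wanted'. */
typedef struct { int wanted; long found_at; } FindArg;

static WalkAction find_int(size_t index, void *entry, void *arg)
{
    FindArg *f = arg;
    if (*(int *)entry == f->wanted) {
        f->found_at = (long)index;
        return WALK_ABORT;                 /* no need to look further */
    }
    return WALK_CONTINUE;
}
```

          A search that finds its match early then costs only as many callback
          invocations as entries examined, instead of a full pass.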

        * Taking a huge step backward, perhaps the Tracker class shouldn't call
          native code at all, but just buffer events; the agent could then call
          another Java method in the class when it wanted to unload the event
          buffer. The Java code could block when the buffer filled or something
          like that, but perhaps the buffer is provided by the agent library so
          that this memory doesn't skew the user app stats?
          This could be a huge experiment; someone would need to allow lots of
          time to play with this. The goal of course would be to get the BCI code
          more into a 100% Java world so that the JVM can optimize it better.

        * A better logging mechanism would be nice. Perhaps a separate logging
          library that the JDWP backend, hprof, and other native code in the JVM
          could use, maybe even the JVM too. Creating a single native logging
          file? Maybe even using the Java logging settings?

      ###@###.### 2004-05-28

            ohair Kelly Ohair (Inactive)