Startup profiling shows that we spend a noticeable amount of time in Ticks::now during class loading, yet the timestamp appears to be used only if the EventClassLoad event is actually committed.
Extracting a pre_class_load_event method that takes Ticks::now only when the event is enabled (that is, when it should be recorded) has a measurable effect on a minimal program, about 3.4 ms or roughly 4% of elapsed time in the runs below:
Baseline run:
chrt -f 99 perf stat -r 250 java Hello > /dev/null
Performance counter stats for 'java Hello' (250 runs):
104,616782 task-clock (msec) # 1,230 CPUs utilized ( +- 0,26% )
847 context-switches # 0,008 M/sec ( +- 0,17% )
54 cpu-migrations # 0,516 K/sec ( +- 1,65% )
3 160 page-faults # 0,030 M/sec ( +- 0,37% )
377 267 792 cycles # 3,606 GHz ( +- 0,25% ) (80,38%)
238 128 005 stalled-cycles-frontend # 63,12% frontend cycles idle ( +- 0,39% ) (85,54%)
184 874 177 stalled-cycles-backend # 49,00% backend cycles idle ( +- 0,38% ) (69,21%)
301 374 735 instructions # 0,80 insn per cycle
# 0,79 stalled cycles per insn ( +- 0,12% ) (84,57%)
58 648 242 branches # 560,601 M/sec ( +- 0,08% ) (82,65%)
2 198 462 branch-misses # 3,75% of all branches ( +- 0,19% ) (83,06%)
0,085036213 seconds time elapsed ( +- 0,22% )
Patched:
chrt -f 99 perf stat -r 250 java Hello > /dev/null
Performance counter stats for 'java Hello' (250 runs):
100,586215 task-clock (msec) # 1,231 CPUs utilized ( +- 0,24% )
827 context-switches # 0,008 M/sec ( +- 0,14% )
60 cpu-migrations # 0,601 K/sec ( +- 1,87% )
3 173 page-faults # 0,032 M/sec ( +- 0,36% )
364 094 344 cycles # 3,620 GHz ( +- 0,24% ) (81,98%)
225 931 914 stalled-cycles-frontend # 62,05% frontend cycles idle ( +- 0,39% ) (86,06%)
174 214 186 stalled-cycles-backend # 47,85% backend cycles idle ( +- 0,39% ) (65,68%)
298 467 000 instructions # 0,82 insn per cycle
# 0,76 stalled cycles per insn ( +- 0,17% ) (83,39%)
58 031 033 branches # 576,928 M/sec ( +- 0,09% ) (83,07%)
2 193 340 branch-misses # 3,78% of all branches ( +- 0,19% ) (84,00%)
0,081682242 seconds time elapsed ( +- 0,19% )
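For illustration, the guarded-timestamp pattern described above can be sketched outside of HotSpot. This is a minimal sketch and not the actual patch: `ticks_now`, `ClassLoadEvent`, `post_class_load_event` and `load_classes` are hypothetical stand-ins, `std::chrono::steady_clock` replaces the real Ticks implementation, and the `enabled` flag is assumed to come from whatever check the event framework provides.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Clock = std::chrono::steady_clock;

// Counts clock reads so the saving from the guard can be observed.
static int g_clock_reads = 0;

// Hypothetical stand-in for Ticks::now (assumption: a plain steady_clock read).
static Clock::time_point ticks_now() {
  ++g_clock_reads;
  return Clock::now();
}

// Hypothetical stand-in for EventClassLoad; only the fields needed here.
struct ClassLoadEvent {
  std::optional<Clock::time_point> start;  // empty when the event is disabled
  Clock::duration elapsed{};
  bool committed = false;
};

// Sketch of the extracted helper: read the clock only if the event is enabled.
static void pre_class_load_event(ClassLoadEvent& event, bool enabled) {
  if (enabled) {
    event.start = ticks_now();
  }
}

// Hypothetical counterpart: commit only if a start time was taken.
static void post_class_load_event(ClassLoadEvent& event) {
  if (!event.start.has_value()) {
    return;  // event disabled: no second clock read, nothing to commit
  }
  event.elapsed = ticks_now() - *event.start;
  event.committed = true;
}

// Simulates loading `count` classes and returns the number of clock reads.
static int load_classes(int count, bool enabled) {
  assert(count >= 0);
  g_clock_reads = 0;
  for (int i = 0; i < count; ++i) {
    ClassLoadEvent event;
    pre_class_load_event(event, enabled);
    // ... the actual class loading work would happen here ...
    post_class_load_event(event);
  }
  return g_clock_reads;
}
```

Calling `load_classes(100, false)` performs no clock reads in this sketch, while `load_classes(100, true)` performs two per class, which is the cost the change avoids when the event is disabled.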