- Enhancement
- Resolution: Fixed
- P2
- 7u6
- Mac OS X (this is all I tested, but the performance improvement should be similar on Windows)
I wanted to compare the performance of Java code versus C code for the FX Pisces Renderer class (in particular, the inner class ScanlineIterator). To do this, I developed a version of ScanlineIterator (called ScanlineIterator2) that would compile under both C and Java and changed Renderer to use this class. This allowed me to very quickly get working Java and C code for the comparison.
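A minimal sketch of how such a comparison could be instrumented, assuming the flag is read as a JVM system property and that timing is accumulated around each call. The class name AlphaTimer, the reporting interval, and the stand-in workload are hypothetical and are not taken from the Pisces sources.

```java
import java.util.Arrays;

// Hypothetical helper: accumulates the time spent in a piece of work and
// prints a running total at a fixed call interval.
final class AlphaTimer {
    // Assumption: -DNATIVE_ALPHA is passed without a value, so only the
    // presence of the property is checked.
    static final boolean NATIVE_ALPHA = System.getProperty("NATIVE_ALPHA") != null;

    private final int reportEvery;
    private long count;
    private long totalNanos;

    AlphaTimer(int reportEvery) {
        this.reportEvery = reportEvery;
    }

    // Runs the work once, adds its elapsed time to the total, and reports
    // on calls 1, reportEvery + 1, 2 * reportEvery + 1, and so on.
    void time(Runnable work) {
        long start = System.nanoTime();
        work.run();
        totalNanos += System.nanoTime() - start;
        count++;
        if ((count - 1) % reportEvery == 0) {
            System.out.println("Count=" + count + ", Time=" + (totalNanos / 1_000_000));
        }
    }

    long count() {
        return count;
    }

    long totalNanos() {
        return totalNanos;
    }

    public static void main(String[] args) {
        System.out.println("NATIVE_ALPHA: " + NATIVE_ALPHA);
        AlphaTimer timer = new AlphaTimer(100);
        int[] alpha = new int[1200];
        // Stand-in workload; the real test would call the Java or C scanline code here.
        for (int call = 0; call < 1000; call++) {
            timer.time(() -> Arrays.fill(alpha, 0xFF));
        }
    }
}
```

Accumulating the total rather than printing per-call times keeps the logging overhead small relative to the work being measured.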
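The win is huge: once warmed up, the frame rate goes from roughly 28 FPS with the Java code to roughly 40 FPS with the C code. Here are the results of VectorTest running at full speed. The second run has -DNATIVE_ALPHA to enable the C code; both runs print the total time in the method every 1000 times it is called: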
**** SLOW:
chart width: 1200.0
chart height: 600.0
NATIVE_ALPHA: false
Count=1, Time=39
Count=101, Time=321
Count=201, Time=594
Count=301, Time=867
Count=401, Time=1148
FPS: 22.60
Count=501, Time=1418
Count=601, Time=1696
Count=701, Time=1979
Count=801, Time=2263
Count=901, Time=2549
Count=1001, Time=2832
FPS: 28.77
Count=1101, Time=3113
Count=1201, Time=3394
Count=1301, Time=3676
Count=1401, Time=3962
Count=1501, Time=4242
Count=1601, Time=4523
FPS: 28.29
Count=1701, Time=4811
Count=1801, Time=5092
Count=1901, Time=5377
Count=2001, Time=5662
Count=2101, Time=5945
FPS: 28.57
Count=2201, Time=6230
Count=2301, Time=6520
Count=2401, Time=6804
Count=2501, Time=7085
Count=2601, Time=7365
Count=2701, Time=7632
FPS: 28.45
Count=2801, Time=7902
Count=2901, Time=8189
Count=3001, Time=8495
Count=3101, Time=8788
Count=3201, Time=9081
Count=3301, Time=9375
FPS: 27.63
Count=3401, Time=9668
Count=3501, Time=9966
Count=3601, Time=10263
Count=3701, Time=10569
Count=3801, Time=10866
Count=3901, Time=11161
FPS: 28.28
Count=4001, Time=11454
Count=4101, Time=11743
Count=4201, Time=12034
Count=4301, Time=12335
Count=4401, Time=12640
FPS: 28.43
Count=4501, Time=12933
Count=4601, Time=13225
Count=4701, Time=13523
Count=4801, Time=13820
Count=4901, Time=14120
Count=5001, Time=14422
FPS: 28.66
Count=5101, Time=14722
Count=5201, Time=15021
Count=5301, Time=15317
Count=5401, Time=15618
Count=5501, Time=15911
Count=5601, Time=16206
FPS: 28.89
Count=5701, Time=16507
Count=5801, Time=16789
Count=5901, Time=17087
Count=6001, Time=17380
Count=6101, Time=17676
FPS: 29.28
Count=6201, Time=17974
Count=6301, Time=18261
Count=6401, Time=18559
Count=6501, Time=18862
Count=6601, Time=19158
***** FAST:
chart width: 1200.0
chart height: 600.0
NATIVE_ALPHA: true
Count=1, Time=3
Count=101, Time=203
Count=201, Time=400
Count=301, Time=600
Count=401, Time=795
Count=501, Time=995
Count=601, Time=1192
FPS: 30.89
Count=701, Time=1392
Count=801, Time=1590
Count=901, Time=1782
Count=1001, Time=1980
Count=1101, Time=2181
Count=1201, Time=2379
Count=1301, Time=2577
Count=1401, Time=2774
FPS: 38.40
Count=1501, Time=2974
Count=1601, Time=3167
Count=1701, Time=3361
Count=1801, Time=3557
Count=1901, Time=3745
Count=2001, Time=3950
Count=2101, Time=4139
Count=2201, Time=4336
FPS: 39.40
Count=2301, Time=4529
Count=2401, Time=4725
Count=2501, Time=4916
Count=2601, Time=5108
Count=2701, Time=5304
Count=2801, Time=5502
Count=2901, Time=5700
Count=3001, Time=5889
FPS: 40.32
Count=3101, Time=6087
Count=3201, Time=6283
Count=3301, Time=6479
Count=3401, Time=6675
Count=3501, Time=6873
Count=3601, Time=7076
Count=3701, Time=7273
Count=3801, Time=7466
FPS: 40.36
Count=3901, Time=7660
Count=4001, Time=7858
Count=4101, Time=8057
Count=4201, Time=8264
Count=4301, Time=8460
Count=4401, Time=8668
Count=4501, Time=8860
Count=4601, Time=9055
FPS: 40.20
Count=4701, Time=9270
Count=4801, Time=9463
Count=4901, Time=9666
Count=5001, Time=9866
Count=5101, Time=10062
Count=5201, Time=10266
Count=5301, Time=10467
Count=5401, Time=10659
FPS: 39.89
Count=5501, Time=10857
Count=5601, Time=11060
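As a rough cross-check of the two logs, the average cost per call can be computed from the last Count/Time pair of each run. This is only a sketch: the class name PerCallCost is hypothetical, and the unit of Time is not stated in the log, so the results are in whatever unit the test printed.

```java
// Hypothetical helper: derives average time per call from the final
// Count/Time pairs printed by the two runs above.
final class PerCallCost {
    // Average time per call, in the same (unstated) unit as the log's Time field.
    static double perCall(long totalTime, long count) {
        return (double) totalTime / count;
    }

    public static void main(String[] args) {
        double slow = perCall(19158, 6601); // last line of the SLOW run
        double fast = perCall(11060, 5601); // last line of the FAST run
        System.out.printf("slow=%.2f fast=%.2f ratio=%.2f%n", slow, fast, slow / fast);
    }
}
```

The ratio of the two averages gives the speedup of the method itself, which is a separate figure from the change in overall FPS.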
- blocks
  - JDK-8101271 GUIMark2.Vector: performance of Prism HW pipeline is worse than Prism SW pipeline (Closed)
- relates to
  - JDK-8100627 TreeItem expand/collapse performance in hardware pipeline is almost 20 fps worse than in j2d pipeline (Closed)
  - JDK-8091120 Native PIsces rasterizer is slower on desktop Linux platforms (Closed)