- Enhancement
- Resolution: Unresolved
- P4
- 11, 12, 13, 14
- x86
- generic
## Reproduce
Run the reproducer with:
1) java TestSuperWordOverunrolling
2) java -XX:-SuperWordLoopUnrollAnalysis TestSuperWordOverunrolling
---------------------------------
public class TestSuperWordOverunrolling {
    public static void main(String[] args) {
        double sum = 0.0;
        long start = System.currentTimeMillis();
        for (int i = 0; i < 50000; i++) {
            sum += execute(256);
        }
        long end = System.currentTimeMillis();
        System.out.println("sum = " + sum + "; time = " + (end - start) + "ms");
    }

    public static double execute(int num_iterations) {
        int M = 63;
        byte[][] G = new byte[M][M];
        int Mm1 = M-1;
        for (int p = 0; p < num_iterations; p++) {
            for (int i = 1; i < Mm1; i++) {
                for (int j = 1; j < Mm1; j++)
                    G[i][j] = G[i-1][j];
            }
        }
        return G[3][2];
    }
}
---------------------------------
## Symptom
1) java TestSuperWordOverunrolling
---------------------------------
sum = 0.0; time = 9360ms
sum = 0.0; time = 9345ms
sum = 0.0; time = 9376ms
sum = 0.0; time = 9389ms
---------------------------------
2) java -XX:-SuperWordLoopUnrollAnalysis TestSuperWordOverunrolling
---------------------------------
sum = 0.0; time = 5564ms
sum = 0.0; time = 5575ms
sum = 0.0; time = 5520ms
sum = 0.0; time = 5552ms
---------------------------------
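To put a number on the gap between the two runs, the timings above can be averaged and compared. The sketch below does only that arithmetic; the class and method names are hypothetical and not part of the reproducer.

```java
public class SlowdownRatio {
    // Arithmetic mean of a set of wall-clock samples, in milliseconds.
    static double mean(long[] samplesMs) {
        long total = 0;
        for (long s : samplesMs) {
            total += s;
        }
        return (double) total / samplesMs.length;
    }

    // How many times slower the first configuration is than the second.
    static double slowdown(long[] slowMs, long[] fastMs) {
        return mean(slowMs) / mean(fastMs);
    }

    public static void main(String[] args) {
        // Timings copied from the two runs reported in this issue.
        long[] withAnalysis = {9360, 9345, 9376, 9389};
        long[] withoutAnalysis = {5564, 5575, 5520, 5552};

        System.out.printf("mean with analysis:    %.1f ms%n", mean(withAnalysis));
        System.out.printf("mean without analysis: %.1f ms%n", mean(withoutAnalysis));
        System.out.printf("slowdown factor:       %.2f%n",
                slowdown(withAnalysis, withoutAnalysis));
    }
}
```

With these samples the default configuration comes out roughly 1.7 times slower than the run with -XX:-SuperWordLoopUnrollAnalysis.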
## Analysis
The slowdown is caused by excessive loop unrolling when SuperWordLoopUnrollAnalysis is enabled.
For this reproducer, the loop is unrolled by a factor of 16, which hurts performance: each run takes about 9.4 s with the analysis enabled versus about 5.5 s with it disabled.
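One way to see why a factor of 16 can be a poor fit is to look at the trip count of the innermost loop, which runs j from 1 to 61, i.e. 61 iterations. The sketch below is a deliberately simplified model, not a description of what C2 actually emits: it ignores the pre-loop used for alignment and the vector width, and it only counts how many iterations a body unrolled by a given factor could cover. The issue does not confirm that this is the root cause, and the class and method names are hypothetical.

```java
public class UnrollCoverage {
    // Inner loop of the reproducer: for (int j = 1; j < Mm1; j++), with Mm1 = 62.
    static final int TRIP_COUNT = 62 - 1;

    // Number of times a body unrolled by `unrollFactor` can run in full.
    static int unrolledPasses(int tripCount, int unrollFactor) {
        return tripCount / unrollFactor;
    }

    // Iterations left over for the non-unrolled remainder loop.
    static int leftover(int tripCount, int unrollFactor) {
        return tripCount % unrollFactor;
    }

    public static void main(String[] args) {
        for (int factor : new int[] {1, 2, 4, 8, 16}) {
            System.out.printf("unroll=%2d  unrolled passes=%2d  leftover iterations=%2d%n",
                    factor,
                    unrolledPasses(TRIP_COUNT, factor),
                    leftover(TRIP_COUNT, factor));
        }
    }
}
```

Under this model, a factor of 16 gives only 3 passes through the unrolled body and leaves 13 of the 61 iterations to the remainder loop, whereas a factor of 4 leaves a single iteration.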
- relates to: JDK-8080325 SuperWord loop unrolling analysis (Resolved)