- Bug
- Resolution: Fixed
- P3
- 20, 21, 22
We have a microbenchmark that measures the performance of a strlen call made through Panama (the FFM API). One such benchmark is:
@Benchmark
public int panama_strlen() throws Throwable {
    try (Arena arena = Arena.ofConfined()) {
        MemorySegment segment = arena.allocateUtf8String(str);
        return (int)STRLEN.invokeExact(segment);
    }
}
Here, we create a confined arena, allocate a string using the arena, then pass the string segment to the native function. This is a very idiomatic usage of the FFM API.
However, benchmarks reveal that using try-with-resources here has a cost: this benchmark comes in at 100ns/op. But if the code is rearranged as follows:
@Benchmark
public int panama_strlen() throws Throwable {
    Arena arena = Arena.ofConfined();
    MemorySegment segment = arena.allocateUtf8String(str);
    int res = (int)STRLEN.invokeExact(segment);
    arena.close();
    return res;
}
Then the benchmark score improves to 86ns/op. I have been able to reproduce similar numbers in other cases where try-with-resources was used.
It would be nice if users didn't have to choose between code clarity and performance.
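For reference, here is a minimal sketch of what the two shapes look like once try-with-resources is expanded, roughly following the translation in JLS 14.20.3.1. The `Resource` class and the `TwrDesugar` name are hypothetical stand-ins for `Arena` (so the sketch runs without the FFM API or a native `strlen`); they are not part of the benchmark. The expanded form carries an exception handler and a suppressed-exception path that the hand-written version simply does not have. Given the linked issue about untaken exception handlers, that extra handler is a plausible place to look for the difference, but that is an assumption, not a confirmed diagnosis.

```java
public class TwrDesugar {
    // Hypothetical stand-in for Arena: counts how many times close() runs.
    static final class Resource implements AutoCloseable {
        static int closed = 0;
        int length(String s) { return s.length(); }
        @Override public void close() { closed++; }
    }

    // The idiomatic form, as in the first benchmark.
    static int withTry(String s) {
        try (Resource r = new Resource()) {
            return r.length(s);
        }
    }

    // Roughly what the compiler expands the method above into
    // (the null check on the resource is omitted for brevity).
    static int desugared(String s) {
        Resource r = new Resource();
        Throwable primary = null;
        try {
            return r.length(s);
        } catch (Throwable t) {
            primary = t;   // remember the body's exception
            throw t;       // precise rethrow: only unchecked types here
        } finally {
            if (primary != null) {
                // Exceptional path: close, attaching any close() failure
                // to the original exception as suppressed.
                try {
                    r.close();
                } catch (Throwable suppressed) {
                    primary.addSuppressed(suppressed);
                }
            } else {
                // Normal path: plain close.
                r.close();
            }
        }
    }

    // The rearranged form, as in the second benchmark: no handler at all.
    // Note that close() is skipped if length() throws.
    static int manual(String s) {
        Resource r = new Resource();
        int res = r.length(s);
        r.close();
        return res;
    }

    public static void main(String[] args) {
        System.out.println(withTry("hello"));
        System.out.println(desugared("hello"));
        System.out.println(manual("hello"));
        System.out.println("closed = " + Resource.closed);
    }
}
```

On the non-exceptional path all three methods return the same value and close the resource once. They differ only when the body throws: the rearranged form then leaks the resource, which is the clarity and safety cost of the faster version.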
- relates to JDK-8267532 C2: Profile and prune untaken exception handlers (Resolved)