Umbrella RFE for testing based on IR nodes/shapes.
Many tests are written for an optimization, modification, or fix of the IR. However, even if these tests pass, we do not know whether the originally intended IR modifications are still applied correctly. The question is how to check IR modifications in a test so that future changes do not break them (e.g. a node transformation is no longer performed). Something in this direction was already done in Valhalla (see framework and example test). Maybe this can be used as a starting point.
Description of the current framework in Valhalla
The goal of this framework is to verify that the code generated by C2 is optimized as expected.
Current tests either only verify correctness (jtreg tests) or overall performance (JMH benchmarks),
but a failing optimization does not necessarily trigger a drop in performance (especially if there
are no fine-tuned and targeted microbenchmarks). For example, we had many regressions in C2 in the
past that no one noticed because they did not show up in the larger benchmarks (which doesn't mean
that these optimizations were useless). Also, such regressions are often hidden/amortized by other
optimizations in the same build/release and therefore go unnoticed.
Since inline types are a completely new feature of the Java language, the specific C2 optimizations
we added were not covered by any existing benchmarks. And even with targeted microbenchmarks, it's
not easy to capture breakages that naturally happen when prototyping. I therefore wrote a framework
that allows defining "match rules" on the Intermediate Representation (IR) generated by C2 to make
sure new optimizations/transformations are applied as expected. For example, one can now write a
correctness test that is annotated with a match rule that verifies that the compiled code does not
contain any allocations (which is one of the main optimizations for inline types). If, against
expectations, the compiled code does contain an allocation, the match rule will fail and report the
error right away.
The current implementation of the framework is rather simple (1000 lines of Java code) and works by
parsing/searching the textual output of the C2 IR (no changes to the VM code required). Of course,
there are several ways to improve this, but for our inline type use cases this turned out to work.
Ideally, the inline type specific framework would be refactored or re-written to be more generic and
upstreamed into mainline such that it can still be used by our inline type specific tests.
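The "no allocation" match rule described above can be illustrated with a minimal sketch. This is not the Valhalla framework's code: the class and method names are hypothetical, and the IR lines are a simplified stand-in for C2's textual output, not real PrintIdeal text. It only shows the idea of searching the textual IR for a forbidden node.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch of a "fail on" match rule; not the framework's actual API.
public class FailOnRuleSketch {
    // Assumption: an allocation shows up in the dump as a node named "Allocate".
    public static final Pattern ALLOC = Pattern.compile("\\bAllocate\\b");

    /** Returns true if no line of the (simplified) IR dump matches the forbidden pattern. */
    public static boolean passesFailOn(List<String> irLines, Pattern forbidden) {
        return irLines.stream().noneMatch(line -> forbidden.matcher(line).find());
    }

    public static void main(String[] args) {
        // Simplified, made-up IR dumps: "<id>  <NodeName>  === <inputs>".
        List<String> optimized = List.of(
            " 3  Start  === 3 0",
            " 21  AddI  === _ 10 11",
            " 22  Return  === 5 6 7 8 9 21");
        List<String> regressed = List.of(
            " 3  Start  === 3 0",
            " 30  Allocate  === 5 6 7 8 1",
            " 22  Return  === 5 6 7 8 9 40");

        System.out.println(passesFailOn(optimized, ALLOC)); // prints true
        System.out.println(passesFailOn(regressed, ALLOC)); // prints false
    }
}
```

A rule of this kind fails as soon as the allocation reappears, even when correctness tests and benchmarks still pass.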
Current framework state summarized
- Started out small and simple and gradually became more powerful, with more functionality and features; handling many testing scenarios also made it a little harder to understand
- More than just IR matching: it is now a testing framework that can also verify the IR by matching IR node patterns or by making claims about the presence or absence of nodes, the number of specific nodes, etc.
- Framework and tests are tightly coupled; profound knowledge of the framework internals is required to write new tests
- Highly depends on inline types
- Generalize the framework to also use it in mainline (upstream)
- Still need to support all tests currently present in Valhalla's testing (~750 tests)
* Update all tests using the old framework to use the new framework without removing functionality (tests should still work the same way)
- Do not modify old tests but rather add a second version for a test that could benefit from IR validation
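As an illustration of the "number of specific nodes" claims mentioned in the summary above, here is a minimal sketch of a count rule. The names are hypothetical and the dump format (`<id>  <NodeName>  === <inputs>`) is a simplified assumption, not actual C2 output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a node-count check on textual IR; not the framework's actual API.
public class CountRuleSketch {
    /** Counts lines of the form "<id>  <nodeName>  ..." in the simplified IR text. */
    public static int countNodes(String irText, String nodeName) {
        Pattern p = Pattern.compile(
            "^\\s*\\d+\\s+" + Pattern.quote(nodeName) + "\\s", Pattern.MULTILINE);
        Matcher m = p.matcher(irText);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Made-up dump with two loads and one store.
        String ir = " 10  LoadI  === _ 7 9\n"
                  + " 11  LoadI  === _ 7 12\n"
                  + " 12  StoreI  === 5 7 13 11\n";
        System.out.println(countNodes(ir, "LoadI"));    // prints 2
        System.out.println(countNodes(ir, "StoreI"));   // prints 1
        System.out.println(countNodes(ir, "Allocate")); // prints 0
    }
}
```

A test could then assert an exact count, for example that a loop body contains exactly one store after optimization.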
Outline and next steps
- Familiarize with existing framework in Valhalla
- Define a clear and simple interface between the new framework and the tests to be written using it
* Should be easy to add new tests without needing to know internals of framework
- Implement new framework by taking over/adapting/rewriting/refactoring the old framework into a test library and make it more generic such that it can be used by all compiler tests
* Valhalla probably needs to add some inline type specific things to the framework again like specific inline node types (but that should be easy to add/remove again when moving to mainline)
* Gradually convert all tests using the old framework to use the new one. The new framework gets "free" testing from the ~750 inline type tests that are already there and from each newly added one.
* No need to maintain two frameworks
* Add separate tests to verify the correctness of the framework (e.g. a test for each matching rule that should match and not match etc.)
- Add new general non-Valhalla specific tests to further test and utilize the framework
- Upstream the new framework into mainline
- Extend the framework to support more match rules and additional functionality.
- Add match rules to existing mainline tests (or write new tests for existing optimizations) and use it with tests for new features.
- Current IR matching is done with PrintIdeal output, should we consider IGV xml output?
* Need to be careful not to do duplicated work if IGV gets an updated format at some point
- Should we add matching on additional flags?
Update February 3rd, 2021
Compiler team internal presentation of current state: http://cr.openjdk.java.net/~chagedorn/TestFramework/TestFramework.pdf
GitHub development branch: https://github.com/openjdk/valhalla/compare/lworld...chhagedorn:TestingFramework
The current state of the new test framework with IR verification was presented internally to the compiler team. There was general agreement to keep the first version simple: cover the basics and already provide a good way to start writing tests with the framework. A summary of the framework:
- Lightweight testing framework
- IR verification with simple Regex matching on PrintIdeal and PrintOptoAssembly
- Easy to use, method annotation based
- Well suited for small/easy to medium sized tests
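To make the "method annotation based" idea concrete, the following is a minimal sketch under stated assumptions. The `@ForbidRegex` annotation, the `verify` helper, and the map from method name to IR text are all hypothetical and invented for illustration. A real framework would obtain the IR from the VM's PrintIdeal/PrintOptoAssembly output rather than from a map.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of annotation-driven IR rules; not the framework's actual API.
public class AnnotationRuleSketch {
    /** Hypothetical rule: the IR of the annotated method must not match this regex. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface ForbidRegex {
        String value();
    }

    /** Example test class: the rule sits directly on the test method. */
    public static class Tests {
        @ForbidRegex("\\bAllocate\\b")
        public static int sum(int a, int b) {
            return a + b;
        }
    }

    /** Collects one failure message per annotated method whose IR text matches its forbidden regex. */
    public static List<String> verify(Class<?> testClass, Map<String, String> irByMethod) {
        List<String> failures = new ArrayList<>();
        for (Method m : testClass.getDeclaredMethods()) {
            ForbidRegex rule = m.getAnnotation(ForbidRegex.class);
            if (rule == null) {
                continue; // method has no IR rule attached
            }
            String ir = irByMethod.getOrDefault(m.getName(), "");
            if (Pattern.compile(rule.value()).matcher(ir).find()) {
                failures.add(m.getName() + ": found forbidden pattern " + rule.value());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Assumption: IR text per method is supplied by hand here.
        Map<String, String> ir = Map.of("sum", " 21  AddI  === _ 10 11\n");
        System.out.println(verify(Tests.class, ir).isEmpty()); // prints true
    }
}
```

Keeping the rule next to the test method means a test author does not need to know how the IR text is collected or matched.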
Summary of discussion/improvement possibilities:
- A way to parse the IR (where simple regex matching is not expressive enough) to query it. Some examples:
* Search for patterns (e.g. node X after node Y)
* Search for other IR properties (e.g. offsets)
* Prune uninteresting nodes
* Apply matching only in specific loops, for example in the hottest loop
=> would be nice to have an API to query information about IR. PrintIdeal/PrintOptoAssembly are limited.
- Use annotations on classes instead of on methods, together with interfaces. This could be used for more complex tests where you can, for example, implement IR verification methods yourself to get more control
- Forbid/let test fail if it deoptimizes (could be done with JFR, PrintCompilation or LogCompilation)
- Use drivers in Jtreg to simplify test setups
- Add vector nodes to standard IR nodes to choose from
- Use IGV xml files instead of PrintIdeal/PrintOptoAssembly
- Should not spend time trying to convert old tests but rather encourage people to start writing new tests with it
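The discussion point about querying the IR for patterns such as "node X after node Y" could look roughly like the sketch below. It is only an illustration: the `Node` type, `parse`, and `hasUseOf` are hypothetical, the line format (`<id>  <NodeName>  === <inputs>`) is a simplified assumption, and "after" is interpreted here as "X takes Y as a direct input", which is only one possible reading.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of a small query layer over parsed IR text; not an existing API.
public class IrQuerySketch {
    /** A parsed node: id, name, and the ids of its inputs ("_" placeholders are skipped). */
    public static final class Node {
        final int id;
        final String name;
        final List<Integer> inputs;

        Node(int id, String name, List<Integer> inputs) {
            this.id = id;
            this.name = name;
            this.inputs = inputs;
        }
    }

    private static final Pattern LINE =
        Pattern.compile("^\\s*(\\d+)\\s+(\\w+)\\s+===\\s*(.*)$");

    /** Parses the simplified dump into a map from node id to node. */
    public static Map<Integer, Node> parse(String irText) {
        Map<Integer, Node> nodes = new HashMap<>();
        for (String line : irText.split("\\R")) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                continue; // ignore lines that do not describe a node
            }
            List<Integer> inputs = new ArrayList<>();
            for (String tok : m.group(3).trim().split("\\s+")) {
                if (tok.matches("\\d+")) {
                    inputs.add(Integer.parseInt(tok));
                }
            }
            int id = Integer.parseInt(m.group(1));
            nodes.put(id, new Node(id, m.group(2), inputs));
        }
        return nodes;
    }

    /** True if some node named userName has a direct input node named defName. */
    public static boolean hasUseOf(Map<Integer, Node> nodes, String userName, String defName) {
        for (Node n : nodes.values()) {
            if (!n.name.equals(userName)) {
                continue;
            }
            for (int in : n.inputs) {
                Node def = nodes.get(in);
                if (def != null && def.name.equals(defName)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String ir = " 10  LoadI  === _ 7 9\n"
                  + " 12  AddI  === _ 10 11\n"
                  + " 13  StoreI  === 5 7 8 12\n";
        Map<Integer, Node> nodes = parse(ir);
        System.out.println(hasUseOf(nodes, "AddI", "LoadI"));   // prints true
        System.out.println(hasUseOf(nodes, "StoreI", "LoadI")); // prints false
    }
}
```

Once the text is parsed into nodes and edges, other queries from the list (pruning uninteresting nodes, restricting matching to a subgraph) become graph operations rather than regular expressions.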
- relates to
JDK-8271471 [IR Framework] Rare occurrence of "<!-- safepoint while printing -->" in PrintIdeal/PrintOptoAssembly can let tests fail
JDK-8273410 IR verification framework fails with "Should find method name in validIrRulesMap"
JDK-8272558 IR Test Framework README misses some flags
JDK-8263412 ClassFileInstaller can't be used by classes outside of default package
JDK-8267980 Add IR tests for JDK-8266601
JDK-8270823 Provide IR test for JDK-8270366
JDK-8043472 JEP 399: Intermediate-Representation Graph Serialization
JDK-8285965 TestScenarios.java does not check for "<!-- safepoint while printing -->" correctly
JDK-8255663 Support multiple tails in C2 type flow analysis
JDK-8262721 Add Tests to verify single iteration loops are properly optimized