Some vector opcodes have no corresponding AArch64 NEON instructions, but supporting them benefits the Vector API. Some of these opcodes are also used by superword for auto-vectorization. Here is the list:
```
VectorCastD2I, VectorCastL2F
MulVL
AddReductionVI/L/F/D
MulReductionVI/L/F/D
AndReductionV, OrReductionV, XorReductionV
```
We ran micro-benchmark performance tests on NEON and found that some of the listed opcodes hurt the performance of loops after auto-vectorization, while others do not.
For superword, we should disable those opcodes that show obvious performance regressions after auto-vectorization on NEON.
In particular, vector multiply long (MulVL) has been implemented but is currently disabled. We should re-enable MulVL for the Vector API, but keep it disabled for auto-vectorization because of the performance regression.
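
For illustration only, here is a minimal sketch of the kind of scalar Java loops that superword may consider for some of the opcodes listed above. The class and method names are hypothetical and are not part of this change. Whether C2 actually selects these vector nodes for a given loop depends on the JDK build, the flags, and the platform, so the opcode names in the comments are the intended mapping, not a guarantee.

```java
// Hypothetical examples: scalar loops that are candidates for auto-vectorization.
final class SuperwordCandidates {

    // Element-wise long multiply: a candidate for MulVL.
    // Assumes a and b are at least as long as out.
    static void mulLong(long[] a, long[] b, long[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = a[i] * b[i];
        }
    }

    // Integer sum: a candidate for AddReductionVI.
    static int sumInt(int[] a) {
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i];
        }
        return sum;
    }

    // Integer xor: a candidate for XorReductionV.
    static int xorInt(int[] a) {
        int acc = 0;
        for (int i = 0; i < a.length; i++) {
            acc ^= a[i];
        }
        return acc;
    }

    // double -> int conversion: a candidate for VectorCastD2I.
    // Assumes in is at least as long as out.
    static void castD2I(double[] in, int[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = (int) in[i];
        }
    }
}
```

Loops of this shape are what the micro-benchmarks exercise: with an opcode disabled for superword, the loop stays scalar, while the same opcode remains available to explicit Vector API code.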