#### Description

** Problem Description

We found a defect in adlc/dfa, specifically in the child constraints generated for vector unary operations.

Unlike an unpredicated vector unary operation, which has only one child, a predicated vector unary operation in fact behaves like a binary node with two children. Consequently, the child constraint generated for a *predicated vector unary operation* is a superset of the one generated for the *unpredicated* version. As a result, there is a risk that predicated vector unary operations match the unpredicated rules by accident.

Currently, both SVE on AArch64 and AVX-512 on x64 work around such failures: 1) AArch64 adds one extra predicate constraint, and 2) x64 sets a lower instruction cost for the predicated rules.

Even so, we still think it is a time bomb and should be fixed.

** PoC

Here is a PoC. See https://github.com/shqking/jdk/commit/50ec9b1923bedd89854366f2a95fec1ff3e6c787

To expose the failure, we made two minor changes in this PoC: 1) setting the operand cost of pRegGov to the default value, and 2) removing the extra predicate constraint from the vabsI() rule.

Take the AArch64 SVE rules for the AbsVI node as an example. For the unpredicated rule vabsI() and the predicated version vabsI_masked(), the "match" statements are quite similar, except that "pg" is treated as the right child in the predicated version.

With this PoC, the DFA state generated for AbsVI is shown below. The log is simplified.

```
void State::_sub_Op_AbsVI(const Node *n){
  if( STATE__VALID_CHILD(_kids[0], VREG) && STATE__VALID_CHILD(_kids[1], PREGGOV) &&   <---- 1
      ( UseSVE > 0 ) )
  {
    unsigned int c = _kids[0]->_cost[VREG]+_kids[1]->_cost[PREGGOV] + SVE_COST;        <---- 2
    DFA_PRODUCTION(VREG, vabsI_masked_rule, c)
  }
  if( STATE__VALID_CHILD(_kids[0], VREG) &&                                            <---- 3
      ( UseSVE > 0) )
  {
    unsigned int c = _kids[0]->_cost[VREG] + SVE_COST;                                 <---- 4
    if (STATE__NOT_YET_VALID(VREG) || _cost[VREG] > c) {                               <---- 5
      DFA_PRODUCTION(VREG, vabsI_rule, c)
    }
  }
  ...
```

1) The child constraint for the predicated version at line 1 contains that for the unpredicated version at line 3.

2) The two predicate constraints are the same, mainly because we removed the extra check from the vabsI() rule in this PoC.

Hence, both if-statements at line 1 and line 3 are matched for a predicated node, and the lower-cost production is finally selected at line 5. Since we modified the operand cost of pRegGov, the cost of the predicated version at line 2 is greater than that of the unpredicated version at line 4.

As a result, the vabsI() rule is matched for the predicated version, which is not our intention. In my local test on an SVE machine, the jtreg test case jdk/incubator/vector/IntMaxVectorTests.java failed.
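
The selection logic described above can be modeled outside of HotSpot with a minimal sketch. It is not the generated code: `ChildState` and `select_absvi_rule` are hypothetical names introduced only for illustration, the costs are arbitrary numbers, and the `UseSVE > 0` predicate is assumed to hold. A null `kid1` stands for an unpredicated node that has no mask input.

```cpp
#include <climits>
#include <string>

// Hypothetical stand-in for a child's matcher state (not HotSpot's State class).
struct ChildState {
    bool valid;          // the child can be reduced to the required operand
    unsigned int cost;   // cost of that reduction
};

// Models the two if-statements generated for AbsVI in the PoC.
// kid1 == nullptr models an unpredicated node (no mask input).
std::string select_absvi_rule(const ChildState* kid0,
                              const ChildState* kid1,
                              unsigned int sve_cost) {
    std::string rule = "none";
    unsigned int best = UINT_MAX;
    bool vreg_valid = false;

    // Markers 1 and 2: predicated rule, needs both children.
    if (kid0 && kid0->valid && kid1 && kid1->valid) {
        best = kid0->cost + kid1->cost + sve_cost;
        rule = "vabsI_masked_rule";
        vreg_valid = true;
    }

    // Markers 3 to 5: unpredicated rule, only looks at the first child,
    // so it is also satisfied by a predicated node.
    if (kid0 && kid0->valid) {
        unsigned int c = kid0->cost + sve_cost;
        if (!vreg_valid || best > c) {
            best = c;
            rule = "vabsI_rule";
            vreg_valid = true;
        }
    }
    return rule;
}
```

In this model, a predicated node whose mask operand has a non-zero cost ends up with `vabsI_rule`, while a zero-cost mask operand leaves `vabsI_masked_rule` in place, because the comparison at marker 5 is strict. This is why the result depends on how the operand and instruction costs are tuned.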

** Impact

All vector unary operations for which a predicated operation is available are affected. Here is the list.

```
AbsV
NegV
SqrtV
PopCountV
CountLeadingZerosV
CountTrailingZerosV
ReverseV
ReverseBytesV
MaskAll
VectorLoadMask
VectorMaskFirstTrue
```

Besides AArch64 SVE, x64 AVX-512 is affected as well. This PoC also makes a minor update to x86.ad, increasing the cost of vabs_masked(). In my local test, an unexpected rule was matched and the jtreg test failed as well.

We argue that

1) The mitigations in AArch64 and x64 can work around the potential matching failure, but we don't think they fundamentally resolve the problem.

2) The root cause is that the child constraints generated for predicated and unpredicated vector unary operations are not mutually exclusive, although they should be.
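
To illustrate what "exclusive" child constraints could mean, here is a minimal sketch in which the unpredicated rule additionally requires that the node has no second child. This is only one possible way to separate the two constraints and is our assumption for illustration; it is not taken from the generated code above. `KidState` and `select_rule_exclusive` are hypothetical names, and the costs are arbitrary.

```cpp
#include <string>

// Hypothetical stand-in for a child's matcher state (not HotSpot's State class).
struct KidState {
    bool valid;          // the child can be reduced to the required operand
    unsigned int cost;   // cost of that reduction
};

// Same shape as the AbsVI matching code, but the two child constraints
// can no longer both hold for the same node.
std::string select_rule_exclusive(const KidState* kid0,
                                  const KidState* kid1,
                                  unsigned int insn_cost) {
    std::string rule = "none";
    unsigned int best = 0;
    bool produced = false;

    // Predicated rule: requires the vector input and the mask input.
    if (kid0 && kid0->valid && kid1 && kid1->valid) {
        best = kid0->cost + kid1->cost + insn_cost;
        rule = "vabsI_masked_rule";
        produced = true;
    }

    // Unpredicated rule: assumed extra check that there is no second child.
    if (kid0 && kid0->valid && kid1 == nullptr) {
        unsigned int c = kid0->cost + insn_cost;
        if (!produced || best > c) {
            best = c;
            rule = "vabsI_rule";
            produced = true;
        }
    }
    return rule;
}
```

With constraints of this shape, a predicated node selects the predicated rule regardless of the mask operand cost, so the outcome no longer depends on cost tuning or on an extra predicate in the unpredicated rule.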
