The field ParallelTaskTerminator::_offered_termination is modified atomically, but is read as a normal field access.
In ParallelTaskTerminator::offer_termination(), this field is read within a loop, but there's nothing stopping a compiler from hoisting the load out of the loop.
The _offered_termination field should be declared volatile.
This was found by code inspection. I haven't seen an actual error, but compiler optimizations or unrelated code changes could make an error occur in the future.
For your amusement, see what gcc -O3 does to this similar loop:
------
int wait_til_done(int* counter_ptr) {
__sync_add_and_fetch(counter_ptr, 1);
while (1) {
if (*counter_ptr == limit) {
return 1;
}
}
}
------
0000000000400550 <wait_til_done>:
400550: f0 83 07 01 lock addl $0x1,(%rdi)
400554: 8b 05 da 0a 20 00 mov 0x200ada(%rip),%eax # 601034 <limit>
40055a: 39 07 cmp %eax,(%rdi)
40055c: 74 02 je 400560 <wait_til_done+0x10>
40055e: eb fe jmp 40055e <wait_til_done+0xe>
400560: b8 01 00 00 00 mov $0x1,%eax
400565: c3 retq
GCC hoisted not only the load, but the entire if-statement, leaving an infinite loop behind.
In ParallelTaskTerminator::offer_termination(), this field is read within a loop, but there's nothing stopping a compiler from hoisting the load out of the loop.
The _offered_termination field should be declared volatile.
This was found by code inspection. I haven't seen an actual error, but compiler optimizations or unrelated code changes could make an error occur in the future.
For your amusement, see what gcc -O3 does to this similar loop:
------
int wait_til_done(int* counter_ptr) {
__sync_add_and_fetch(counter_ptr, 1);
while (1) {
if (*counter_ptr == limit) {
return 1;
}
}
}
------
0000000000400550 <wait_til_done>:
400550: f0 83 07 01 lock addl $0x1,(%rdi)
400554: 8b 05 da 0a 20 00 mov 0x200ada(%rip),%eax # 601034 <limit>
40055a: 39 07 cmp %eax,(%rdi)
40055c: 74 02 je 400560 <wait_til_done+0x10>
40055e: eb fe jmp 40055e <wait_til_done+0xe>
400560: b8 01 00 00 00 mov $0x1,%eax
400565: c3 retq
GCC hoisted not only the load, but the entire if-statement, leaving an infinite loop behind.