Can two consecutive memory_order_release stores on the same thread be reordered with each other?

Question

Can two consecutive memory_order_release stores on the same thread be reordered with each other, either from the perspective of the same thread or of a different thread loading them?

The documentation on cppreference says:

A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store.

So in this example:

std::atomic<uint64_t> a;
std::atomic<uint64_t> b;

// ...

a.store(0xDEADBEEF, std::memory_order::memory_order_release);
b.store(0xBEEFDEAD, std::memory_order::memory_order_release);

I would expect that the a store cannot be reordered after the b store. However, maybe the b store can still be reordered before the a store, which would be equivalent? I'm not sure how to read the language.

Put another way: the documentation seems to say the a store can't be moved down. Does it also guarantee b can't be moved up?

I am trying to determine whether, if another thread acquire-loads b and sees 0xBEEFDEAD and then acquire-loads a, it is guaranteed to see 0xDEADBEEF in a.

c++

multithreading

memory-barriers

memory-model

stdatomic

asked on Stack Overflow Jan 16, 2020 by Joseph Garvin • edited Jan 17, 2020 by curiousguy

2 Answers

The notion of reordering memory operations (such as reads and writes) is often used to make inter-thread memory visibility more "concrete", since reordering tasks is an everyday experience for anyone who has had to juggle blocked and unblocked things to do. But reordering isn't the basis of inter-thread communication and memory visibility. And by the way, the memory_order_x values are about visibility, not "order". Don't use the term "memory order"!

Release semantics are defined by a promise made to any thread that can see the stored value. (That is why release is only a property of a modification of a shared variable; a read of an atomic object, even with memory_order_seq_cst, can never be a release operation.)

A thread that sees the value written by a release operation can assume that previous operations are "finished". The operations on shared objects that have to be "finished" are reads and writes, and also other things such as the construction of an object (which your source forgot to mention). Operations that were done "before" (earlier in program order, or even in a different thread, transitively with the same "finished" property) can be treated as done by a thread that does an acquire read of the written value. (If you did a relaxed read, you can use an acquire fence afterward to get acquire-read semantics.)
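The relaxed-read-plus-fence remark can be sketched as follows. This is a minimal illustration, not code from the question: the names payload, ready, producer, consumer and run_fence_demo are hypothetical. The consumer spins with relaxed loads and only then issues std::atomic_thread_fence(std::memory_order_acquire), which gives the same guarantee an acquire load would have given.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical names for illustration only.
std::uint64_t payload = 0;          // ordinary, non-atomic data
std::atomic<bool> ready{false};     // flag written with release semantics

void producer() {
    payload = 42;                                   // "finished" before the release store
    ready.store(true, std::memory_order_release);   // publishes everything sequenced before it
}

std::uint64_t consumer() {
    // Relaxed loads alone give no visibility guarantee for payload.
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // The acquire fence is sequenced after the relaxed load that read `true`,
    // so the release store synchronizes with this fence.
    std::atomic_thread_fence(std::memory_order_acquire);
    return payload;                                 // safe to read: no data race
}

// Runs one producer and one consumer and returns what the consumer saw.
std::uint64_t run_fence_demo() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    std::uint64_t seen = 0;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Calling run_fence_demo() from main should return 42. Without the fence (and with only relaxed loads), reading payload would be a data race.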

It's important to note that release and acquire operations act as bounds and determine the mutual exclusion of operations, as with a mutex: the atomic object is used to obtain mutual exclusion between the writing thread and the reading thread.

a.store(0xDEADBEEF, std::memory_order::memory_order_release);

The store to a doesn't need any specific visibility guarantee, as there is no previous memory operation (assuming we are at the beginning of parallelism) to make visible.

b.store(0xBEEFDEAD, std::memory_order::memory_order_release);

That release operation (on b) is the important one. The compiler can't "reorder" things because other threads can read b (which isn't a thread-private variable), could see the specific value 0xBEEFDEAD, conclude that the release occurred, and rely on acquire semantics to guarantee mutual exclusion of:

stuff before the store release
stuff after the load acquire

That is, only if the user code checks that the value was written, and only if the value could come from there. So essentially the user code implements the mutual exclusion protocol, but in the end the compiler makes it work.
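A minimal sketch of that protocol, under assumptions: the Message struct and the functions writer, try_read and run_publish_demo are hypothetical names, and only b and the sentinel 0xBEEFDEAD come from the question. The reader touches the plain data only after it has checked that the acquire load returned the sentinel.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical payload type for illustration.
struct Message {
    std::uint64_t id;
    std::uint64_t value;
};

Message msg{};                       // plain, non-atomic shared data
std::atomic<std::uint64_t> b{0};     // the atomic used to publish it

void writer() {
    msg.id = 7;                      // stuff before the store release
    msg.value = 1234;
    b.store(0xBEEFDEAD, std::memory_order_release);
}

// Returns true and fills `out` only if the sentinel value was observed.
bool try_read(Message& out) {
    if (b.load(std::memory_order_acquire) != 0xBEEFDEAD) {
        return false;                // not published yet: reading msg here would be a data race
    }
    out = msg;                       // stuff after the load acquire
    return true;
}

// Starts the writer, polls until the message is published, and checks its contents.
bool run_publish_demo() {
    msg = Message{};
    b.store(0, std::memory_order_relaxed);
    std::thread t(writer);
    Message got{};
    while (!try_read(got)) {
        std::this_thread::yield();
    }
    t.join();
    return got.id == 7 && got.value == 1234;
}
```

The check in try_read is the user-code half of the protocol: if the reader skipped it and read msg unconditionally, nothing would order that read after the writes in writer.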

Regarding the quote:

The documentation on CPP reference says:

A store operation with this memory order performs the release operation: no reads or writes in the current thread can be reordered after this store.

I can easily give at least three cases where the reordering is allowed.

The first and most obvious one is a reordering that compilers always perform across function calls: a modification of a purely local variable, not accessible from anywhere else, can be reordered with an external call. That is obviously not even preventable by a specific call like a barrier, as it's a general transformation.

The others are transformations that couldn't be made across an external function call, but atomic operations, unlike calls to separately compiled functions, are known to the compiler:

any action of a strictly function local thread communication primitive, be it a mutex or atomic variable, can be reordered with anything as no other thread can observe or interact with the variable;
when an atomic object A is manipulated in such a way that the compiler can see all operations on it, if the value stored is never changed (it retains its original value), then any operation on another object can be reordered, for example, with a release store on A.

These might be pretty uninteresting and silly (who uses a mutex as a local variable?) special cases, but they logically exist.

answered on Stack Overflow Jan 16, 2020 by curiousguy • edited Jan 17, 2020 by curiousguy

// T1
a.store(0xDEADBEEF, std::memory_order::relaxed); // #1
b.store(0xBEEFDEAD, std::memory_order::release); // #2

// T2
if (b.load(std::memory_order::acquire) == 0xBEEFDEAD) {      // #3
   assert(a.load(std::memory_order::relaxed) == 0xDEADBEEF); // #4
}

#1 is sequenced before #2. #2 synchronizes with #3 (inside the if, #3 has read the value stored by #2), and #3 is sequenced before #4. That means #1 happens before #4. By [intro.races]p18, assuming that there are no other modifications to a, #4 must take its value from #1, i.e., the assert will never fire.
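For anyone who wants to run this, here is a self-contained sketch of the same two threads. The driver run_trials and the function names t1 and t2 are hypothetical additions, and the code uses the std::memory_order_relaxed / std::memory_order_release spellings, which compile from C++11 onward. It stores and checks 0xDEADBEEF. Passing runs illustrate the guarantee but of course do not prove it; the proof is the happens-before argument above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

std::atomic<std::uint64_t> a{0};
std::atomic<std::uint64_t> b{0};

// T1: the relaxed store to a is sequenced before the release store to b.
void t1() {
    a.store(0xDEADBEEF, std::memory_order_relaxed);  // #1
    b.store(0xBEEFDEAD, std::memory_order_release);  // #2
}

// T2: waits until #3 reads the value stored by #2, then performs #4.
bool t2() {
    while (b.load(std::memory_order_acquire) != 0xBEEFDEAD) {  // #3
        std::this_thread::yield();
    }
    return a.load(std::memory_order_relaxed) == 0xDEADBEEF;    // #4
}

// Hypothetical driver: repeats the experiment and reports whether #4
// saw the value from #1 every time.
bool run_trials(int trials) {
    for (int i = 0; i < trials; ++i) {
        // Reset before the threads start; thread creation orders these stores first.
        a.store(0, std::memory_order_relaxed);
        b.store(0, std::memory_order_relaxed);
        bool ok = false;
        std::thread reader([&ok] { ok = t2(); });
        std::thread writer(t1);
        writer.join();
        reader.join();
        if (!ok) {
            return false;
        }
    }
    return true;
}
```

Calling run_trials(200) from main is expected to return true on any conforming implementation, whether #1 is relaxed as here or a release store as in the question.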

answered on Stack Overflow Jan 16, 2020 by T.C. • edited Jan 16, 2020 by T.C.

User contributions licensed under CC BY-SA 3.0