I was just thinking about multi-threaded reference counting, searched for it and found many posts, that basicly only mention the problem of atomicity, many answers even here on stackoverflow miss the actual problems involved in multi-threaded reference counting. So what's the fundamental problem.
Let's assume an object type with a reference counter like
struct SomethingRefcounted {
int refcount;
// other stuff
} * sr;
Reference counting means, our reference counter shall equal the number of references to the object at all times, possibly being slightly higher temporarily during pointer assignments.
For all further code, I assume volatile and atomic operations.
Now, each time, we create a new reference, we do an implicit ++sr->refcount;
each time we remove the reference, we do an implicit if (!--sr->refcount) free(sr);
.
However, if one thread does the decrement and another thread tries the increment at the same time, we get a race, with can only be understood considering CPU registers.
SomethingRefcounted * sr1, * sr2;
int thread_1() {
for (;;) sr1 = NULL;
}
int thread_2() {
for (;;) sr2 = sr1;
}
int thread_3() {
for (;;) sr1 = new SomethingRefcounted;
}
int main() {
create the threads and let them run
}
The problem here is thread_2: The moment it reads the pointer sr1 into a CPU register, it violates the assumption, the refcounter correctly counts the number of references. The refcounter is 1 after the new assignment to sr1 but even ignoring thread_1 for a moment, once sr1 is read into a CPU register by thread_2, there are 2 reference to the object, first the variable sr1 and the CPU register of thread_2, but the refcounter is still only 1, violating our constraint from above. The following increment will fix it, but this fix will come too late, if thread_1 decrements it to 0 before.
There are solutions involving locks (global locks, maybe hashed for many objects sharing one lock out of a pool of locks, so not every object needs it's own lock but also there doesn't need to be a single application wide lock causing lock contention). One solution I came up with however is to xchg the pointer on read, so the constraint, refcounter >= number of references is enforced. thread_2 in assembly could then look like this:
LOAD 0xdeadbeef -> reg1
L1:
XCHG [sr1] <-> reg1
CMP reg1 <-> 0xdeadbeef
JUMPIFEQUAL L1
INC [reg1]
STORE reg1 -> [sr1]
STORE reg1 -> [sr2] ; Actually, here, we have to do the
; refcount decrement on sr2 first, but you get the point
So, this is a simple spinlock, waiting for other concurrently running accesses like this to complete. Once we successfully XCHGed the pointer, no one else can get it, so we can be sure, the ref counter is at least 1 (it can't be 0, because we just found a reference to it) and it can't be decremented down to zero (even if there are more references, the one we have now, we have it exclusively in your CPU register, and that one contributes 1 to the refcounter, preventing it from reaching zero).
Similarily, thread_1 would look like this:
LOAD 0xdeadbeef -> reg1
L1:
XCHG [sr1] <-> reg1
CMP reg1 <-> 0xdeadbeef
JUMPIFEQUAL L1
LOAD 0 -> reg2
STORE reg2 -> [sr1]
DEC [reg1]
JUMPIFZERO free_it
Now, I am wondering if there are any more efficient solution to that problem. (Or even whether I miss something here).
User contributions licensed under CC BY-SA 3.0