I've been doing some research on STM (software transactional memory) implementations, specifically on algorithms that utilize locks and are not dependent on the presence of a garbage collector in order to maintain compatibility with non-managed languages like C/C++. I've read the STM chapter in Herlihy and Shavit's "The Art of Multiprocessor Programming", as well as read a couple of Shavit's papers that describe his "Transactional Locking" and "Transactional Locking II" STM implementations. Their basic approach is to utilize a hash-table that stores the values of a global version-clock and a lock to determine if a memory location has been touched by another thread's write. As I understand the algorithm, when a writing transaction is performed, the version-clock is read and stored in thread-local memory, and a read-set and write-set are also created in thread-local memory. Then the following steps are performed:
If any of the above check-steps fail (i.e., steps #1, #3, and #5), then the write-transaction is aborted.
The process for a read-transaction is a lot simpler. According to Shavit's papers, we simply
If either step #2 or #4 fail, then the read-transaction is aborted.
The question that I can't seem to solve in my mind though is what happens when you attempt to read a memory location inside an object that is located in the heap, and another thread calls delete
on a pointer to that object? In Shavit's papers, they go into detail to explain how there can be no writes to a memory location that has been recycled or freed, but it seems that inside of a read-transaction, there is nothing preventing a possible timing scenario that would allow you to read from a memory location inside of an object that is has been freed by another thread. As an example, consider the following code:
Thread A
executes the following inside of an atomic read-transaction: linked_list_node* next_node = node->next;
Thread B
executes the following: delete node;
Since next_node
is a thread-local variable, it's not a transactional object. The dereferencing operation required to assign it the value of node->next
though actually requires two separate reads. In between those reads, delete
could be called on node
, so that the read from the member next
is actually reading from a segment of memory that has already been freed. Since the reads are optimistic, the freeing of the memory pointed to by node
in Thread B
won't be detected in Thread A
. Won't that cause a possible crash or segmentation fault? If it does, how could that be avoided without locking the memory locations for a read as well (something that both the text-book as well as the papers denotes is unnecessary)?
The simple answer is that delete
is a side effect, and transactions do not play nice with side effects.
Logically, because transactions can be rolled back at any time, you can't deallocate memory in the middle of a transaction.
I don't think there is a single universal answer to "how this shall be handled", but a common approach is to defer the delete
call until commit-time. The STM API should either do this automatically (for example providing their own delete
function and requiring you to do that), or by giving you a hook where you can register "actions to perform on commit". Then during your transaction you can register an object to be deleted if and when the transaction commits.
Any other transaction working on the deleted object should then fail the version check and roll back.
Hope that helps. There isn't a simple answer to side effects in general. It's something each individual implementation will have to come up with mechanisms to handle.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With