That was quicker than I expected

Most of the time, data that you need to share between threads is protected by a lock, and the barrier properties of a lock guarantee that stores and loads don't get reordered by the combination of the compiler, the JIT and the hardware. However, there are times when you aren't using a lock to protect data in one of the threads, and in this case you need to be aware of the .NET memory model in order to reason about the consequences.
Why am I thinking about this now? Well, Martin Simmons recently did a talk about the multi-processor extension to LispWorks. In order to give guarantees to a programmer, so that they can be sure that a program will work across architectures, the program needs to stay within the realm of the defined behaviour of the memory model.
So what is the .NET memory model?
It is a set of rules that define how loads and stores to memory can appear to change order when seen from multiple threads. From a single thread, causality guarantees that a store to a location followed by a read will see the value that was written. However, when a store to one location and a load from a different location occur, effects such as caching in the memory subsystem can be reasoned about by considering how the load and the store may be reordered.
The CLR 2.0 memory model defines the following:
  (i) A load may move past a load that follows it in the instruction stream – load-load reordering is possible.
  (ii) Load-store reordering is possible.
  (iii) Store-store reordering is not possible.
  (iv) Store-load reordering is possible.
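Since store-load reordering is allowed, the classic Dekker-style litmus test can observe both threads loading zero. Here is a minimal sketch of that test (the class and method names are mine, not anything standard):

```csharp
using System;
using System.Threading;

class StoreLoadDemo
{
    static int x, y, r1, r2;

    // Runs the two-thread litmus test 'trials' times and counts how often
    // both threads load zero - an outcome only possible if a store and a
    // subsequent load were reordered in at least one of the threads.
    public static int RunTrials(int trials)
    {
        int bothZero = 0;
        for (int i = 0; i < trials; i++)
        {
            x = 0; y = 0;
            var t1 = new Thread(() => { x = 1; r1 = y; }); // store x, then load y
            var t2 = new Thread(() => { y = 1; r2 = x; }); // store y, then load x
            t1.Start(); t2.Start();
            t1.Join(); t2.Join();
            if (r1 == 0 && r2 == 0) bothZero++;
        }
        return bothZero;
    }

    static void Main()
    {
        Console.WriteLine("r1 == r2 == 0 seen {0} times", RunTrials(10000));
    }
}
```

Even on x86/x64, where the hardware model is fairly strong, the store buffer alone can produce the both-zero outcome; a Thread.MemoryBarrier() between the store and the load in each thread would rule it out.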
There are then several kinds of fence that prevent the motion of loads and stores:
  (i) An acquire fence prevents loads and stores that follow it from moving before it.
  (ii) A release fence prevents loads and stores that precede it from moving after it.
  (iii) A full fence prevents loads and stores from moving across it in either direction.
Taking a lock (using the Monitor class, for example), using an Interlocked operation, or calling Thread.MemoryBarrier all create a full fence. A read of a volatile field, or a call to Thread.VolatileRead, is an acquire fence. A write of a volatile field, or a call to Thread.VolatileWrite, is a release fence. Some compiler optimisations are also prevented: the compiler/JIT cannot introduce extra loads or stores for volatile variables or for variables that reference the GC heap, though it can merge multiple consecutive loads or stores to the same location.
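As a sketch of how the acquire/release pairing is used in practice (the field and class names here are illustrative, not from any library): a producer writes data with an ordinary store and then sets a volatile flag; because the volatile write is a release fence, a consumer that observes the flag through a volatile (acquire) read is guaranteed to also see the data written before it.

```csharp
using System;
using System.Threading;

class FlagPublish
{
    static int s_Data;              // ordinary field, published via the flag
    static volatile bool s_Ready;   // volatile: write is a release, read is an acquire

    public static void Produce()
    {
        s_Data = 42;    // ordinary store...
        s_Ready = true; // ...cannot move after this release-fenced store
    }

    public static int Consume()
    {
        while (!s_Ready)        // acquire-fenced load on every iteration
            Thread.SpinWait(1);
        return s_Data;          // guaranteed to see the value stored before the flag
    }

    static void Main()
    {
        var producer = new Thread(Produce);
        producer.Start();
        Console.WriteLine(Consume()); // prints 42
        producer.Join();
    }
}
```

Without the volatile on s_Ready, the load of s_Data in Consume could in principle move before the load of the flag, which is exactly the load-load hazard discussed below.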
It is these kinds of guarantees that almost make the following singleton pattern safe on the CLR.
private Class m_Instance = null;        // Needs to be volatile
private object m_Lock = new object();

public Class Instance
{
    get
    {
        if (m_Instance == null)
            lock (m_Lock)
                if (m_Instance == null)
                    m_Instance = new Class();
        return m_Instance;
    }
}
In the weaker ECMA memory model, store-store reordering may happen, so the stores initialising the new instance and the store setting the m_Instance variable may intermix, potentially allowing another thread to see the partially constructed object. In the CLR 2.0 memory model, store-store reordering isn't allowed, so no other thread can see the partially constructed instance.
However, we DO need to make m_Instance volatile. The .NET memory model doesn’t guarantee that load-load reordering won’t happen (though in reality it will only happen on the IA64). This means that the data could be read from the fields of m_Instance before they are initialized, much in the same way the stores can happen out of order in the weaker memory models when the singleton instance is constructed.
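Putting that together, a corrected version of the pattern looks roughly like this (a sketch; the Singleton class and its Value field are illustrative names of my own):

```csharp
using System;

public sealed class Singleton
{
    // volatile: the unlocked read is an acquire and the write inside the
    // lock is a release, so the load-load reordering described above is
    // prevented even on IA64.
    private static volatile Singleton s_Instance;
    private static readonly object s_Lock = new object();

    public int Value;   // state initialised in the constructor

    private Singleton() { Value = 42; }

    public static Singleton Instance
    {
        get
        {
            if (s_Instance == null)
            {
                lock (s_Lock)
                {
                    if (s_Instance == null)
                        s_Instance = new Singleton();
                }
            }
            return s_Instance;
        }
    }
}
```

Every caller gets the same instance, and any caller that reads the field through the volatile acquire is guaranteed to see Value fully initialised.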
Interesting stuff that makes it clear how hard it is to do lock-free programming on mutable data structures.
This entry was posted in Computers and Internet.
