No change here

The BCL immutable collections were released on Nuget some time ago, but I’ve only just got to play with them. There has recently been an accompanying Channel 9 interview discussing the implementation.

So called persistent collections have been the main data structure in the Clojure language for quite some time. Clojure emphasis functional paradigms, and its implementation of persistent datatypes offer good performance while keeping the benefits of no mutation – friendliness to concurrency and removal of the need for compensation actions if an exception is thrown when state is being modified.

Persistent trees seem to have become popular at Microsoft because of the Roslyn project. This project is a rewrite of the C# compiler in C#, and it is in this scenario that there is a need to efficiently modify large abstract syntax trees without copying, a scenario that perfectly matches the persistent types.

It’s easy to add the immutable collections to your project.
    Install-Package Microsoft.Bcl.Immutable –Pre

And then you can start making instances using the slightly strange pattern.
    var x = ImmutableList.Create<int>();

The pattern follows the example of the Tuple type in .NET 4. C# cannot infer the instantiation required for the type parameters of a generic class at the class level, but it can do so for method level type parameters. This syntax avoids the need to specify the type parameters in examples like:
    var x2 = ImmutableList.Create(1, 2, 3);

Using Reflector it is interesting to look at the implementation of the type itself which is based on a binary tree. Such an implementation makes it possible to frequently share most of the tree except nodes along the spine that is modified by an operation.

There are a few optimisations in the implementation. Immutable trees can be converted to mutable trees, which can implement some operations more efficiently. These can then be converted back into immutable trees, as in the following example.

var x = ImmutableList.Create<int>();
var y = x.AddRange(new [] {1,2,3,4});
var z = y.ToBuilder();
z.Add(10);
var z2 = z.ToImmutable();

The idea is that we can locally make things mutable, in order to improve efficiency, and then convert back to immutable before passing the value out of the local context. This kind of idea is used in F#, where lots of implementation is locally mutable, but non-locally things look very functional and are therefore easier to reason about.

The implementation of Sort, as shown by Reflector, pushes the contents into an Array, sorts the array and then reproduces an immutable tree.

internal ImmutableList<T>.Node Sort(int index, int count, IComparer<T> comparer)
{
    Requires.Range(index >= 0, “index”, null);
    Requires.Range(count >= 0, “count”, null);
    Requires.Argument((index + count) <= this.Count);
    Requires.NotNull<IComparer<T>>(comparer, “comparer”);
    T[] array = new T[this.Count];
    this.CopyTo(array);
    Array.Sort<T>(array, index, count, comparer);
    return ImmutableList<T>.Node.NodeTreeFromList(array.AsOrderedCollection<T>(), 0, this.Count);
}

In contrast, the Reverse operation, uses the representation,

internal ImmutableList<T>.Node Reverse(int index, int count)
{
    Requires.Range(index >= 0, “index”, null);
    Requires.Range(count >= 0, “count”, null);
    Requires.Range((index + count) <= this.Count, “index”, null);
    int num = index;
    for (int i = (index + count) – 1; num < i; i–)
    {
        T local = node[num];
        T local2 = node[i];
        node = node.ReplaceAt(i, local).ReplaceAt(num, local2);
        num++;
    }
    return node;
}

relying on the fact that ReplaceAt uses a Mutate operation which can work slightly more efficiently if the tree is mutable (ie the frozen property is set to false). This frozen properly requires that the whole tree is traversed in order to set the frozen property of each node in it.

private ImmutableList<T>.Node Mutate(ImmutableList<T>.Node left = null, ImmutableList<T>.Node right = null)
{
    if (this.frozen)
    {
        return new ImmutableList<T>.Node(this.key, left ?? this.left, right ?? this.right, false);
    }
    if (left != null)
    {
        this.left = left;
    }
    if (right != null)
    {
        this.right = right;
    }
    this.height = 1 + Math.Max(this.left.height, this.right.height);
    this.count = (1 + this.left.count) + this.right.count;
    return (ImmutableList<T>.Node) this;
}

To see how the tree gets rebalanced, look at the RotateLeft and RotateRight on the Node inner class in the ImmutableList<T> class.

The assembly also contains an immutable stack, which has Peek and Pop operations, and a pair of such stacks are used to implement immutable queues in the classic way (We push enqueued items on to one stack, and dequeue elements by popping – when there are no more elements to pop, we take the push stack and reverse it and then construct a new pop stack from the resulting elements). There are also DebuggerProxy implementations which allow Visual Studio’s visualizers to present a clearer view of the datatypes when debugging. They also throw in Dictionary, Set and Sorted Dictionaries giving a very useful set of datatypes which I’m dying to use for real.

Advertisements
This entry was posted in Computers and Internet. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s