C# fun with structs

I’ve just spent a while catching up with all the changes to value types in C# 7.x. As someone who rarely uses the struct keyword, I was surprised by how much the language has changed in this area. This talk is a very good summary.

The key point about structs is that they let you avoid some of the heap allocation that can be very costly for certain types of application. Recent versions of C# let you hold a reference (a managed pointer) to a struct, and hence work with the original rather than a copy of it.

The classic example is something like the following (where we use a local function to make the example easier to read).

  var data = new[] {1, 2, 3, 4, 5};
  ref int second = ref GetItem(data);
  second = 5;

  ref int GetItem(int[] incoming)
  {
    ref int x = ref incoming[1];  // index 1 is the second element
    return ref x;
  }

In the example, the local variable second is an alias for an item in the data array, so a change made through either is visible through the other. The first thing one notices is the addition of a lot of “ref” keywords, which make it clear that the local variable holds an alias and that the result of the method call is an alias to something else. It’s a shame that there wasn’t a better syntax for this.

Two other things have happened in this area: the “in” modifier for parameters, and the ability to define a struct as readonly.
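The same aliasing matters more when the array holds structs, because a plain local would copy the whole element. The following is a minimal sketch of that difference, not code from the talk; the Point struct, the At helper and the RefLocalDemo class are hypothetical names made up for illustration.

```csharp
using System;

// Hypothetical struct used only for this illustration.
struct Point
{
    public int X;
    public int Y;
}

static class RefLocalDemo
{
    // Returns an alias to the array element rather than a copy of it.
    public static ref Point At(Point[] points, int index) => ref points[index];

    public static void Main()
    {
        var points = new Point[3];

        Point copy = points[1];               // plain local: copies the struct
        copy.X = 10;                          // changes only the copy

        ref Point alias = ref At(points, 1);  // ref local: aliases the element
        alias.X = 20;                         // writes through to the array

        Console.WriteLine(copy.X);            // prints 10
        Console.WriteLine(points[1].X);       // prints 20
    }
}
```

The write through copy never reaches the array, while the write through alias does.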

In the classic struct as a parameter case, the struct is copied on the way in to the method. Hence, for the example struct,

struct A
{
  public int _x;
  public void Increment() => _x++;
}

we get the following behaviour.

  A a = new A();
  Increment(a);

  void Increment(A x)
  {
    x.Increment();
    x.Increment();
    Debug.Assert(x._x == 2);
  }

  Debug.Assert(a._x == 0);

We can avoid the copy using ref, but then changes made inside the method affect the caller.

  A a = new A();
  Increment(ref a);

  void Increment(ref A x)
  {
    x.Increment();
    x.Increment();
    Debug.Assert(x._x == 2);
  }

  Debug.Assert(a._x == 2);

We can pass the argument as an “in” parameter, which passes it by read-only reference and so stops it being copied on entry to the method.

  A a = new A();
  Increment(in a);

  void Increment(in A x)
  {
    x.Increment();
    x.Increment();
    Debug.Assert(x._x == 0);
  }

  Debug.Assert(a._x == 0);

Now, of course, we have to answer the question: why don’t we see the parameter x change after the calls to x.Increment()? The answer is that the compiler takes a defensive copy each time it makes the call, because x is read-only and Increment might mutate it. You can see this in the IL that is generated (and this is all covered really well by this blog post).

In the above code, the IL for the two calls to x.Increment() is

    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: ldobj ConsoleApp15.Program/A
    L_0007: stloc.0 
    L_0008: ldloca.s a
    L_000a: call instance void ConsoleApp15.Program/A::Increment()
    L_000f: nop 
    L_0010: ldarg.0 
    L_0011: ldobj ConsoleApp15.Program/A
    L_0016: stloc.0 
    L_0017: ldloca.s a
    L_0019: call instance void ConsoleApp15.Program/A::Increment()
    L_001e: nop 

Changing the definition of A to

readonly struct A
{
  public readonly int _x;
  public void Increment() => Console.WriteLine(_x);
}

the compiler knows that no method of A can modify the instance, so it skips the copy and the IL changes to the more expected

    L_0000: nop 
    L_0001: ldarg.0 
    L_0002: call instance void ConsoleApp15.Program/A::Increment()
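To see both cases side by side, here is a small sketch that can be compiled with C# 7.2 or later. Counter mirrors the mutable struct A above; Reading, its Next method and the InDemo class are hypothetical names introduced only for this example. The absence of a defensive copy in the readonly case is not observable from the program’s output, so that part can only be confirmed by inspecting the IL as above.

```csharp
using System;

// Mutable struct, equivalent to the first definition of A.
struct Counter
{
    public int Value;
    public void Increment() => Value++;
}

// Hypothetical readonly struct: no member can modify the instance.
readonly struct Reading
{
    public readonly int Value;
    public Reading(int value) { Value = value; }
    public Reading Next() => new Reading(Value + 1);
}

static class InDemo
{
    public static int BumpTwice(in Counter c)
    {
        c.Increment();   // runs on a compiler-made defensive copy
        c.Increment();   // runs on a second defensive copy
        return c.Value;  // the original value is untouched
    }

    // Reading is readonly, so the compiler can call Next() on r directly.
    public static int NextValue(in Reading r) => r.Next().Value;

    public static void Main()
    {
        var counter = new Counter();
        Console.WriteLine(BumpTwice(in counter));  // prints 0
        Console.WriteLine(counter.Value);          // prints 0

        var reading = new Reading(41);
        Console.WriteLine(NextValue(in reading));  // prints 42
    }
}
```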

It all goes to show that value types are a bit confusing in C#, and that it is really hard to know whether you need to optimise them. You need to measure to tell whether the extra copying makes a real difference to the performance of your application.

I got interested in the struct-related changes after looking into the implementation of Span, which is implemented as a “ref struct”. Span is a powerful new abstraction over data such as arrays; it allows you to slice them without copying, and without performance-sapping view types. To implement such a thing, the view (the Span instance) needs to be stack allocated, with a lifetime that is guaranteed to end before its stack frame is unwound. This is a new idea for the CLR, which has never before given stack-only values this kind of special treatment.

You can play with Span using the pre-release System.Memory NuGet package.

            var data = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
            var subdata = data.AsSpan().Slice(3, 4);

The subdata item is a Span, and looking at it in the debugger’s raw view shows that it has the fields:

   _byteOffset 0x00000010 System.IntPtr
   _length 4 int 
   _pinnable {int[9]} System.Pinnable 

This efficient implementation requires that the Span object be confined to the stack, which is all covered in this proposal document. Span (and Memory, its counterpart that can be stored on the heap) are likely to make their way into many framework classes in the coming releases because of how much they can reduce allocation in cases where data is pulled from a buffer. The System.Memory package targets .NET Standard and so is already available to run on a large number of platforms.
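As a rough illustration of slicing without copying, the sketch below reuses the data array from earlier and writes through the slice. It assumes the System.Memory package (or a runtime with Span built in) is referenced; the SpanDemo class and its Sum helper are hypothetical names for this example only.

```csharp
using System;

static class SpanDemo
{
    // Accepts any contiguous run of ints without allocating or copying.
    public static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        foreach (int v in values) total += v;
        return total;
    }

    public static void Main()
    {
        var data = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        Span<int> subdata = data.AsSpan().Slice(3, 4);  // views 4, 5, 6, 7

        subdata[0] = 40;                   // writes through to data[3]
        Console.WriteLine(data[3]);        // prints 40
        Console.WriteLine(Sum(subdata));   // prints 58 (40 + 5 + 6 + 7)
    }
}
```

Because the slice is only a view, the write to subdata[0] shows up in data, and Sum sees the updated element.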
