If you use R# to code generate, make sure to regenerate if things change

At work the other day, when I was looking through a pull request, I noticed what looked like a very strange equality method on a C# struct, shown here as if it were named Foo.

public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj)) return false;
    if (ReferenceEquals(this, obj)) return true;
    if (obj.GetType() != this.GetType()) return false;
    return Equals((Foo) obj);
}

The code looks very wasteful – you cannot inherit from structs, so rather than the GetType malarkey you’d surely expect just an ‘is’. In the end it turned out that the Equals method had been generated by R#, but at the time it was generated, the containing type had been a class and this was later converted to a struct.

For a struct, R# would have generated the much more reasonable looking

public override bool Equals(object obj)
{
    if (ReferenceEquals(null, obj)) return false;
    return obj is Foo && Equals((Foo) obj);
}
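To make the struct form concrete, here is a minimal self-contained sketch. The two int fields, the IEquatable<Foo> implementation and the hash code are my own assumptions for illustration; the original Foo's members are not shown in the pull request excerpt.

```csharp
using System;

// Hypothetical two-field struct; the real Foo's fields are not shown above.
public struct Foo : IEquatable<Foo>
{
    private readonly int _x;
    private readonly int _y;

    public Foo(int x, int y)
    {
        _x = x;
        _y = y;
    }

    // Strongly typed comparison: no boxing and no type check needed.
    public bool Equals(Foo other)
    {
        return _x == other._x && _y == other._y;
    }

    // 'is' already returns false for null, so it covers the null check too.
    public override bool Equals(object obj)
    {
        return obj is Foo && Equals((Foo)obj);
    }

    public override int GetHashCode()
    {
        unchecked { return (_x * 397) ^ _y; }
    }
}

public static class FooDemo
{
    public static void Main()
    {
        object boxed = new Foo(1, 2);
        Console.WriteLine(new Foo(1, 2).Equals(boxed));         // True
        Console.WriteLine(new Foo(1, 2).Equals((object)null));  // False
        Console.WriteLine(new Foo(1, 2).Equals("not a Foo"));   // False
    }
}
```

Calling Equals(Foo) directly avoids boxing altogether, which is the usual reason for pairing the override with IEquatable<T> on a struct.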

While I was checking my understanding of what was going on, I spent a little time looking at the x86 code that was generated by the various methods.

A cut down version of the code using is,

public bool Test2(object obj)
{
    if (!(obj is A)) return false;
    return true;
}

generates the code

0:004> u 002f00f8 002f0119
002f00f8 55              push    ebp
002f00f9 8bec            mov     ebp,esp
002f00fb 85d2            test    edx,edx                     <<<< check null
002f00fd 740c            je      002f010b
002f00ff 813a2c381200    cmp     dword ptr [edx],12382Ch     <<<< check type handle
002f0105 7502            jne     002f0109
002f0107 eb02            jmp     002f010b
002f0109 33d2            xor     edx,edx                     <<<< generate less than optimal code
002f010b 85d2            test    edx,edx
002f010d 7504            jne     002f0113
002f010f 33c0            xor     eax,eax
002f0111 5d              pop     ebp
002f0112 c3              ret
002f0113 b801000000      mov     eax,1
002f0118 5d              pop     ebp
002f0119 c3              ret

Because a struct cannot be derived from, the C# ‘is’ can simply generate a check that the type handle (vtable) of the object matches that of the desired type.

The longer form code, simplified to

public bool Test1(object obj)
{
    if (ReferenceEquals(null, obj)) return false;
    if (obj.GetType() != GetType()) return false;
    return true;
}

is a little more complicated.

0:004> u 002f0098 002f00e7
002f0098 55              push    ebp
002f0099 8bec            mov     ebp,esp
002f009b 57              push    edi
002f009c 56              push    esi
002f009d 50              push    eax
002f009e 8bf9            mov     edi,ecx
002f00a0 8bf2            mov     esi,edx
002f00a2 85f6            test    esi,esi     <<<<<<< null   check
002f00a4 7507            jne     002f00ad
002f00a6 33c0            xor     eax,eax
002f00a8 59              pop     ecx
002f00a9 5e              pop     esi
002f00aa 5f              pop     edi
002f00ab 5d              pop     ebp
002f00ac c3              ret
002f00ad b92c381200      mov     ecx,12382Ch             <<<< type handle for the structure type
002f00b2 e84920e2ff      call    00112100   <<<<<< ***** See below for this!!!!!!!!!!
002f00b7 8945f4          mov     dword ptr [ebp-0Ch],eax
002f00ba 8bce            mov     ecx,esi
002f00bc e86bad2472      call    mscorlib_ni+0x27ae2c (7253ae2c)    <<<<<< GetType
002f00c1 8bf0            mov     esi,eax
002f00c3 0fbe07          movsx   eax,byte ptr [edi]
002f00c6 8b55f4          mov     edx,dword ptr [ebp-0Ch]
002f00c9 884204          mov     byte ptr [edx+4],al
002f00cc 8bca            mov     ecx,edx
002f00ce e859ad2472      call    mscorlib_ni+0x27ae2c (7253ae2c)   <<<<<<<< GetType
002f00d3 3bf0            cmp     esi,eax
002f00d5 7407            je      002f00de
002f00d7 33c0            xor     eax,eax
002f00d9 59              pop     ecx
002f00da 5e              pop     esi
002f00db 5f              pop     edi
002f00dc 5d              pop     ebp
002f00dd c3              ret
002f00de b801000000      mov     eax,1
002f00e3 59              pop     ecx
002f00e4 5e              pop     esi
002f00e5 5f              pop     edi
002f00e6 5d              pop     ebp
002f00e7 c3              ret

On the line marked ***** we allocate a boxed instance of Foo. The mov before it loads the type handle of the struct type into ecx, and this is passed as the first argument to the called method.
mov     ecx,12382Ch
call    00112100

Addresses differ as this is a different run, but the called method is doing allocation of a new boxed instance.

0:000>  u 00162100 0016211b
00162100 8b4104          mov     eax,dword ptr [ecx+4]
00162103 648b15380e0000  mov     edx,dword ptr fs:[0E38h]  <<< Use fast allocation buffer
0016210a 034240          add     eax,dword ptr [edx+40h]
0016210d 3b4244          cmp     eax,dword ptr [edx+44h]
00162110 7709            ja      0016211b
00162112 894240          mov     dword ptr [edx+40h],eax
00162115 2b4104          sub     eax,dword ptr [ecx+4]
00162118 8908            mov     dword ptr [eax],ecx   <<<<< Set the object header
0016211a c3              ret
0016211b e915788b73      jmp     clr!JIT_New (73a19935)   <<<< fast path not available so punt

I thought it was quite interesting to see the boxing required to call GetType on ‘this’ when ‘this’ is a struct instance, and it is also interesting to see the bump allocation that happens on the fast path allocation.
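The fast path in the allocation helper can be modelled with a toy bump allocator. This is only a sketch of my reading of the listing: the class and member names are invented for illustration and are not CLR APIs, and offsets into an imaginary buffer stand in for real addresses.

```csharp
using System;

// Toy model of bump allocation; names are illustrative, not CLR APIs.
public sealed class BumpAllocator
{
    private readonly int _limit; // end of the buffer (plays the role of [edx+44h])
    private int _next;           // next free offset  (plays the role of [edx+40h])

    public BumpAllocator(int capacity)
    {
        _limit = capacity;
        _next = 0;
    }

    // Returns the start offset of the new block, or -1 when the buffer is
    // exhausted and a slow path (JIT_New in the listing) would take over.
    public int Allocate(int size)
    {
        int newNext = _next + size;       // add eax,[edx+40h]
        if (newNext > _limit) return -1;  // cmp / ja: punt to the slow path
        _next = newNext;                  // mov [edx+40h],eax
        return newNext - size;            // sub eax,[ecx+4]
    }
}

public static class BumpDemo
{
    public static void Main()
    {
        var allocator = new BumpAllocator(32);
        Console.WriteLine(allocator.Allocate(12)); // 0
        Console.WriteLine(allocator.Allocate(12)); // 12
        Console.WriteLine(allocator.Allocate(12)); // -1: would need the slow path
    }
}
```

The real helper also writes the type handle into the first word of the new object; the toy version leaves that out because it only tracks offsets.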

The key point, however, is that code generation is all very well, but it is useful to record the assumptions behind the generation so that you can regenerate if things change.

Posted in Computers and Internet

TPL Dataflow still has its uses

TPL Dataflow by Example: Dataflow and Reactive programming in .NET by Matt Carkci

We’ve been doing some coding at work on a distributed system which does some processing that seems to match the streaming approach of the dataflow library. There are loads of fairly interesting posts on the TDF (TPL Dataflow library), both on MSDN and on the PFX team blog, but sometimes a book is good as it gives a more rounded picture of the technology, and perhaps tells you about problems that users commonly have.

This book is very short, coming in at 52 pages, though there are numerous pages with full code listings (of which the actual code you are interested in forms a tiny percentage), so it is a fairly quick read. There’s a quick introduction to the dataflow library, and then the vast majority of the book is filled with the author going through the block types that come with the library: the execution blocks like ActionBlock, the buffering blocks like BufferBlock and the grouping blocks like BatchedJoinBlock. Each block type is accompanied by a simple code sample that takes many pages. This kind of material is available elsewhere.
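For a flavour of how the block types fit together, here is a small sketch of my own (it is not taken from the book) that links a BufferBlock to a TransformBlock and then to an ActionBlock. The SquareAll helper is a made-up name, and the example assumes the System.Threading.Tasks.Dataflow assembly is available to the project.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

public static class DataflowDemo
{
    // Squares each input and collects the results in arrival order.
    public static List<int> SquareAll(IEnumerable<int> inputs)
    {
        var results = new List<int>();

        var buffer = new BufferBlock<int>();
        var square = new TransformBlock<int, int>(x => x * x);
        // An ActionBlock handles one message at a time by default,
        // so the list does not need a lock here.
        var collect = new ActionBlock<int>(x => results.Add(x));

        // Propagate completion so finishing the head finishes the whole chain.
        var link = new DataflowLinkOptions { PropagateCompletion = true };
        buffer.LinkTo(square, link);
        square.LinkTo(collect, link);

        foreach (int input in inputs) buffer.Post(input);
        buffer.Complete();
        collect.Completion.Wait();
        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", SquareAll(new[] { 1, 2, 3 }))); // 1, 4, 9
    }
}
```

Forgetting PropagateCompletion is one way to end up waiting on a Completion task that never finishes, so the sketch sets it explicitly on both links.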

The unique parts of the book are really the last 8 or so pages which list some ideas and gotchas when designing a dataflow program. These were an interesting read, and many of the items I hadn’t come across in a single source.

To be honest though, if you want to understand the implementation of Dataflow, this Channel 9 video interview with Stephen Toub gives a lot of very useful detail about the interfaces that the TDF library defines. If you are considering writing your own blocks, this document discusses many of the implementation issues that you will face and, along the way, gives more detail about the expected protocols behind the interfaces.

Posted in Books

Some interesting bits and pieces

This series which implements a simplified browser is really interesting. I’ve tried to find my way around the HTML and CSS specifications in the past, and they are a fairly dry read, whilst seeing code and hence being able to get a grip on the algorithms is a great way of improving one’s understanding.

Lenses seem to be a common pattern making its way through the functional programming world, and this F# version looks interesting.

The type theory podcast is shaping up nicely. The initial podcast talks about the connection between testing and type theory, and the second episode covers Idris, whose focus on dependent types seems to be quite the rage at the moment. The first episode mentions Px, a system for deriving programs from proofs. I still have the book on the bookshelf and it was the reason that I wanted to study for a PhD.

Mirage, a functional operating system, is also looking interesting, as does Docker, an application container framework that runs on Linux. Linux seems to have a rich history of such containers. The background on how this is implemented is covered here.

Posted in Computers and Internet

The best explanation of cache coherency I’ve seen

A Primer on Memory Consistency and Memory Coherence by Daniel Sorin, Mark Hill and David Wood

This is, by far, the best explanation of cache coherency and memory consistency that I have read. There are lots of books that give the subject some coverage, from programming manuals on various languages to works on parallel programming, but none of them match the standard of this text. It starts out by covering sequential consistency and then gives a great explanation of why you may want to weaken this model in order to gain performance. It then has a chapter on the predominant TSO/x86 model (which Intel’s documentation hints that they support) and then follows this with a great chapter on even weaker memory models. The text is introductory, but covers things in a way that made what I had read in the past consistent. The following chapters cover cache coherency protocols, from the shared bus MOESI protocol to the more scalable directory based protocols. I liked the way that the authors explain some of the gaps in the usual protocols, where the usual state machines fail to explain that we often need to get a response to a message before the transition can be fully made for the state of a particular cache line.
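The classic illustration of why TSO is weaker than sequential consistency is the store buffering test: each thread writes one variable and then reads the other. The sketch below is my own, not an example from the book, and the names are invented. Without a fence, both reads returning 0 is a permitted outcome (it may or may not show up on a given machine); with a full fence between the store and the load it is forbidden.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StoreBufferLitmus
{
    private static int _x, _y, _r1, _r2;

    // Runs the store-buffering test and counts runs where both loads saw 0.
    public static int CountBothZero(int iterations, bool useFence)
    {
        int bothZero = 0;
        for (int i = 0; i < iterations; i++)
        {
            _x = 0;
            _y = 0;
            Parallel.Invoke(
                () =>
                {
                    Volatile.Write(ref _x, 1);
                    // Full fence: the store must be visible before the load runs.
                    if (useFence) Thread.MemoryBarrier();
                    _r1 = Volatile.Read(ref _y);
                },
                () =>
                {
                    Volatile.Write(ref _y, 1);
                    if (useFence) Thread.MemoryBarrier();
                    _r2 = Volatile.Read(ref _x);
                });
            if (_r1 == 0 && _r2 == 0) bothZero++;
        }
        return bothZero;
    }

    public static void Main()
    {
        // Without the fence a non-zero count is allowed, though not guaranteed.
        Console.WriteLine("no fence:   " + CountBothZero(100000, false));
        // With the fence the both-zero outcome cannot happen.
        Console.WriteLine("with fence: " + CountBothZero(100000, true));
    }
}
```

Parallel.Invoke is used only to keep the sketch short; it may run the two delegates on the same thread, in which case the interesting outcome simply never appears for that iteration.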

This introduction is fairly short and therefore quick to read, but explains a subject that is often covered in far less detail. Simply brilliant, answering all sorts of unanswered questions from past reads.

Posted in Computers and Internet

csc generates code as well you know

I wondered the other day how the C# compiler, csc.exe, decides which version of the runtime to target. Some C# language features are effectively syntactic sugar that is logically translated into other C#, which is then compiled. But when you target a certain version of the framework, how does the compiler change this code generation? The answer appears to be that it looks at the references that you compile against.

Take this small C# class definition in a file enumerable.cs

using System.Collections.Generic;

class Test
{
    IEnumerable<int> DoGeneration()
    {
        yield return 10;
    }
}

If you compile from the command line using

csc /t:library enumerable.cs

you’ll see that the generated class’s constructor uses CurrentManagedThreadId

[Screenshot: pic1]

If you compile the same file referencing the .NET 4 version of mscorlib.dll (via a reference assembly on my machine, which has 4.5 installed)

csc /t:library /nostdlib /r:"c:/Program Files (x86)/Reference Assemblies/Microsoft/Framework/.NETFramework/v4.0/mscorlib.dll" enumerable.cs

you’ll see that it references ManagedThreadId

[Screenshot: pic2]

I think it’s rather cool that the C# translation can use different methods from the framework depending on what it expects to find on the target. It can lead to a few interesting and unexpected problems though.
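For readers without a decompiler to hand, the following is a hand-written approximation of the kind of class the compiler produces for DoGeneration. It is a sketch under my own assumptions: the class and field names are invented (the real generated names cannot be written in C#), and the state numbering is only roughly what the compiler uses. The point of interest is the thread id captured in the constructor.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hand-written approximation of the compiler-generated iterator class for
//     IEnumerable<int> DoGeneration() { yield return 10; }
// All names here are illustrative only.
public sealed class DoGenerationIterator : IEnumerable<int>, IEnumerator<int>
{
    // -2: not yet enumerated, 0: start, 1: after the yield, -1: finished
    private int _state;
    private int _current;
    private readonly int _initialThreadId;

    public DoGenerationIterator(int state)
    {
        _state = state;
        // Against a 4.5 mscorlib the compiler can use this property; against
        // the 4.0 reference assembly it has to go via ManagedThreadId instead.
        _initialThreadId = Environment.CurrentManagedThreadId;
    }

    public IEnumerator<int> GetEnumerator()
    {
        // Reuse this instance for the first enumeration on the creating thread.
        if (_state == -2 && _initialThreadId == Environment.CurrentManagedThreadId)
        {
            _state = 0;
            return this;
        }
        return new DoGenerationIterator(0);
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:
                _current = 10; // yield return 10;
                _state = 1;
                return true;
            default:
                _state = -1;
                return false;
        }
    }

    public int Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }
    public void Reset() { throw new NotSupportedException(); }
    public void Dispose() { }
}

public static class IteratorDemo
{
    public static void Main()
    {
        foreach (int value in new DoGenerationIterator(-2))
            Console.WriteLine(value); // 10
    }
}
```

The thread id check is what lets the same object serve as both the enumerable and its first enumerator, which is why the constructor needs a thread id property from whichever mscorlib it is compiled against.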

Posted in Computers and Internet

Making your C# more effective

Effective C#: 50 Specific Ways To Improve Your C# by Bill Wagner

This is one of those books that lists 50 different issues and pieces of advice for using the programming language; there are variants for C# and Java and many other languages. The items are divided into 6 different chapters.

Chapter one, “Language Idioms”, discusses some language level issues: avoiding user defined conversion operators, conditional attributes instead of #if and using “is” and “as” instead of casting. All of the advice seems very reasonable. The next chapter is on “Resource Management”, and goes into detail about class initialization and then covers the standard Dispose pattern. Immutability and when to use a value type instead of a reference type are also covered.

Chapter three, “Expressing Designs in C#”, is very good and full of good advice. Limiting visibility and not returning references to internal objects are covered, as are using interfaces instead of inheritance and the difference between interface methods and virtual methods. There are items on defining callbacks using delegates and using the event pattern for notifications. In this mixed bag, there is a discussion about making chunky rather than chatty calls, and also a discussion of co- and contra-variance.

Chapter four is entitled “Working With The Framework”. This covers ordering relationships with IComparer<T> and IComparable<T> and then moves on to writing parallel algorithms using PLINQ.

Chapter five covers the dynamic type, in a chapter with the rather misleading title of “Dynamic Programming in C#”. There’s a lot of discussion about how dynamic works, and a good explanation of expression trees.

The last chapter, “Miscellaneous”, throws in some advice about boxing and structuring applications as sets of small assemblies.

There were some items that I found very useful: how to use IFormattable to define better string representations for types, minimizing duplicated constructor logic, and some of the PLINQ notes in particular. Not a bad book, but some of the advice is either well known or potentially just a matter of opinion.
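As a reminder of what the IFormattable item is about, here is a small sketch of my own. The Temperature type and its "C" and "F" format strings are invented for illustration and are not taken from the book.

```csharp
using System;
using System.Globalization;

// Hypothetical type used only to illustrate IFormattable.
public struct Temperature : IFormattable
{
    private readonly double _celsius;

    public Temperature(double celsius) { _celsius = celsius; }

    public override string ToString()
    {
        return ToString("C", CultureInfo.CurrentCulture);
    }

    // "C" formats as Celsius, "F" as Fahrenheit; null or empty means "C".
    public string ToString(string format, IFormatProvider formatProvider)
    {
        if (string.IsNullOrEmpty(format)) format = "C";
        switch (format.ToUpperInvariant())
        {
            case "C":
                return _celsius.ToString("F1", formatProvider) + " C";
            case "F":
                return (_celsius * 9 / 5 + 32).ToString("F1", formatProvider) + " F";
            default:
                throw new FormatException("Unknown format: " + format);
        }
    }
}

public static class FormatDemo
{
    public static void Main()
    {
        var t = new Temperature(21.5);
        // string.Format passes the text after the colon to IFormattable.ToString.
        Console.WriteLine(string.Format(CultureInfo.InvariantCulture, "{0:C} is {0:F}", t));
        // 21.5 C is 70.7 F
    }
}
```

Passing the provider through to the underlying double means the decimal separator follows the caller's culture rather than being hard-coded.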

The author appears on a recent .NET Rocks where he discusses C# 6.0.

Posted in Books

A good read on concurrency in general (and Java in particular)

Java Concurrency in Practice by Brian Goetz et al

I’d been hearing people mention this book for a long time, and it took me quite a while to actually get hold of a copy to read. It’s a very good book. Its strength is that it covers the Java concurrency related libraries in good detail, but more than this, there is a lot of supporting material on concurrency in general.

The second chapter of the book covers what it means for something to be thread safe. This term is thrown around a lot by developers, but the chapter makes it clear that there are lots of properties that you may require of a thread safe class, and it is surprisingly difficult to pin down exactly what thread safety is. The next two chapters of the book talk about the sharing and composing of objects, and how this affects their concurrent use. The last chapter of the introductory section covers the basic building blocks, including synchronised and concurrent collections, blocking queues, blocking and interruptible methods, and synchronisers. All of the topics are covered in depth, and I think most people would learn something from reading them.

The second section of the book is about structuring concurrent applications. It covers the notion of tasks, and the connected notions of cancellation and shutdown. The Java mechanism for running thread pools is covered, and this is followed by a discussion of the interplay between GUI applications and concurrency. This has a very well written section on why GUIs are single threaded, which neatly answers the question of why you always need to move onto the GUI thread before changing GUI elements.

Section three is on liveness, performance and testing. This covers livelocks, deadlocks, performance, scalability, Amdahl’s law and also has a good discussion of testing concurrent programs.

Section four, advanced topics, covers locks and moves on to a discussion of when you might want to use atomic variables and non-blocking synchronisation. There is also a section on the Java memory model and the subtle guarantees of the platform, such as how it needs to ensure that publication happens correctly after a constructor runs. There is also a chapter on AbstractQueuedSynchronizer, a class which acts as a superclass for many of the Java library’s synchronization constructs.
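The atomic variable idea carries over directly to .NET. The sketch below is my own and is written in C# rather than Java, to match the other code on this blog; the AtomicMax class is an invented name. It shows the usual compare-and-swap retry loop behind non-blocking updates.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Lock-free running maximum, updated with a compare-and-swap retry loop.
public sealed class AtomicMax
{
    private int _value = int.MinValue;

    public int Value { get { return Volatile.Read(ref _value); } }

    public void Update(int candidate)
    {
        while (true)
        {
            int seen = Volatile.Read(ref _value);
            if (candidate <= seen) return; // nothing to do
            // Publish candidate only if no other thread changed _value
            // since we read it; CompareExchange returns the previous value.
            if (Interlocked.CompareExchange(ref _value, candidate, seen) == seen) return;
            // Another thread won the race: re-read and try again.
        }
    }
}

public static class AtomicDemo
{
    public static void Main()
    {
        var max = new AtomicMax();
        Parallel.For(0, 10000, i => max.Update(i));
        Console.WriteLine(max.Value); // 9999
    }
}
```

No thread ever blocks here; a thread that loses the race just repeats its read and comparison, which is the trade-off the book discusses against taking a lock.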

This is a thoroughly interesting book, both explaining the Java library very well and providing a lot of general advice on the topic of concurrency, and is well worth a read.

Posted in Books