Take your PIC

Doing interviews is great, as something interesting always comes up in the conversation. In the latest one I was part of, the question of how a call via an interface works on the CLR was mentioned. I vaguely remembered a blog post from a long time ago about how the implementation changed in the transition from .NET 1.1 to .NET 2. In .NET 2, the system uses what I’d call a PIC (a polymorphic inline cache) to handle the call efficiently.

Of course I had to try this out for myself. To debug at this level inside Visual Studio, you need to run the example as a release build with “just my code” turned off and “suppress JIT optimisations on load” turned off, and with “Enable Unmanaged Debugging” set on the Debug tab of the project.

With that we can try running this small example.

using System;

interface ITest
{
    void CallMethod();
}

class Program : ITest
{
    public void CallMethod()
    {
        Console.WriteLine("boo");
    }

    static void Main(string[] args)
    {
        ITest target = new Program();
        for (int i =0; i < 1000; i++)
            target.CallMethod();
    }
}

If we set a breakpoint on the target.CallMethod() line and run, then the first time the breakpoint is hit the code disassembles to:

            for (int i =0; i < 1000; i++)
00000011  xor         edi,edi
                target.CallMethod();
00000013  mov         ecx,esi
00000015  call        dword ptr ds:[00980010h]
            for (int i =0; i < 1000; i++)
0000001b  inc         edi 
0000001c  cmp         edi,3E8h

If we step over the call instruction and then examine the address 980010, we see that it contains 986012. Disassembling this location, by going to the immediate window, loading sos and using the !u command, we find:

.load sos
extension C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\sos.dll loaded
!u 00986012
PDB symbol for mscorwks.dll not loaded
Unmanaged code
00986012 50               push        eax
00986013 6800000300       push        30000h
00986018 E908F25079       jmp         79E95225

If we now let the code run for a while, say until the breakpoint has been hit 5 times, we see that the code reached through the call has been rewritten. It now sits at 00987012 and reads:

!u 00987012
Unmanaged code
00987012 81394C339700     cmp         dword ptr [ecx],97334Ch
00987018 0F85F32F0000     jne         0098A011
0098701E E985905200       jmp         00EB00A8

When the comparison succeeds we fall through to the jmp, which targets the method itself.

!u 00EB00A8
Normal JIT generated code
ConsoleApplication43.Program.CallMethod()
Begin 00eb00a8, size 1a
>>> 00EB00A8 55               push        ebp
00EB00A9 8BEC             mov         ebp,esp
00EB00AB E840D24378       call        792ED2F0 (System.Console.get_Out(), mdToken: 06000772)
00EB00B0 8BC8             mov         ecx,eax
00EB00B2 8B1530203202     mov         edx,dword ptr ds:[02322030h] ("boo")
00EB00B8 8B01             mov         eax,dword ptr [ecx]
00EB00BA FF90D8000000     call        dword ptr [eax+000000D8h]
00EB00C0 5D               pop         ebp
00EB00C1 C3               ret

So what is 97334Ch? The stub compares it against the first word of the object that ecx points at, and the value is simply the type handle for the Program type, as we can see from the code that constructs the instance at the start of the Main method.

            ITest target = new Program();
00000000  push        ebp 
00000001  mov         ebp,esp
00000003  push        edi 
00000004  push        esi 
00000005  mov         ecx,97334Ch
0000000a  call        FFAB1FAC
0000000f  mov         esi,eax
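You can get at a type handle from managed code too, which makes for a quick sanity check. The sketch below is mine rather than part of the original experiment: the class name TypeHandleDemo and the helper HandleOf are made up for illustration, and I'm assuming that RuntimeTypeHandle.Value is the same number the stub compares against, which I'd expect for a plain non-generic class but haven't verified against this particular debugging session. The actual value will differ from run to run and machine to machine.

```csharp
using System;

class TypeHandleDemo
{
    // Hypothetical helper: returns the runtime's handle for the
    // concrete type of the object passed in.
    public static IntPtr HandleOf(object o)
    {
        return o.GetType().TypeHandle.Value;
    }

    static void Main(string[] args)
    {
        object target = new TypeHandleDemo();

        // Print the handle in hex so it can be compared by eye with
        // the constant seen in the debugger (assumption: same value).
        Console.WriteLine("{0:X}", HandleOf(target).ToInt64());

        // Every instance of the same type reports the same handle.
        Console.WriteLine(HandleOf(target) == typeof(TypeHandleDemo).TypeHandle.Value);
    }
}
```

The point is only that the handle identifies the type, not the instance, which is what makes it usable as a cheap "is this the type I saw last time" test.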

The idea here is simple. At a given call site, if we call via an interface on a particular object type, then we are likely to call again with the same object type in subsequent calls. Hence we can optimise by hardwiring code which checks for the expected type and then jumps into the target method if the type is the same. If the type is different then we can branch to a fix-up routine instead.
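To make the shape of this concrete, here is a minimal sketch in C# of a single call site with a one-entry cache. It is only a model: the real runtime does this in generated machine code, and the names InlineCacheSite, Resolve, Hits and Misses are all invented for the illustration. The type check plays the role of the cmp, the cached delegate plays the role of the jmp, and the slow path stands in for the fix-up routine.

```csharp
using System;

interface ITest
{
    void CallMethod();
}

class First : ITest
{
    public void CallMethod() { }
}

class Second : ITest
{
    public void CallMethod() { }
}

// Hypothetical model of one interface call site with a one-entry cache.
class InlineCacheSite
{
    private Type expectedType;          // the type we guess we'll see again
    private Action<ITest> cachedTarget; // where to go when the guess is right
    public int Hits;
    public int Misses;

    public void Invoke(ITest receiver)
    {
        // Fast path: same type as last time, go straight to the target.
        if (receiver.GetType() == expectedType)
        {
            Hits++;
            cachedTarget(receiver);
            return;
        }

        // Slow path: resolve the target and remember the new guess.
        Misses++;
        expectedType = receiver.GetType();
        cachedTarget = Resolve(expectedType);
        cachedTarget(receiver);
    }

    // Stand-in for the runtime's lookup. Here it just makes an
    // ordinary interface call, so this models control flow only.
    private static Action<ITest> Resolve(Type type)
    {
        return delegate(ITest r) { r.CallMethod(); };
    }
}

class InlineCacheDemo
{
    static void Main(string[] args)
    {
        InlineCacheSite site = new InlineCacheSite();
        ITest target = new First();
        for (int i = 0; i < 1000; i++)
            site.Invoke(target);

        // Prints "hits=999 misses=1": only the first call takes the slow path.
        Console.WriteLine("hits={0} misses={1}", site.Hits, site.Misses);
    }
}
```

Unlike this model, which caches on the very first miss, the runtime in the experiment above waited for a few calls before specialising.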

Of course, it’s all a matter of tradeoffs. Rewriting the code means flushing instruction caches so we’d better make sure that we’re calling on the same object type multiple times. That’s why it takes a few calls before this optimisation happens… the runtime can monitor activity at the call site and dynamically optimise. All rather cool, I think you’ll agree.
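The "wait until it's worth it" policy can be sketched in a few lines of C#. Again this is a toy model and not the CLR's actual heuristic: the class WarmingCallSite, its Observe method and the threshold of 3 are all assumptions of mine, chosen only to show a call site that counts repeated sightings of one type before committing to a specialised path.

```csharp
using System;

// Hypothetical model: a call site that specialises only after it has
// seen the same receiver type several times in a row.
class WarmingCallSite
{
    public const int Threshold = 3; // assumed value, not the runtime's real policy

    private Type lastSeen;
    private int streak;

    public bool Specialised { get; private set; }
    public Type ExpectedType { get; private set; }

    // Returns true when the call would have taken the fast path.
    public bool Observe(object receiver)
    {
        Type t = receiver.GetType();

        if (Specialised)
        {
            if (t == ExpectedType)
                return true;

            // Wrong guess: drop the specialisation and start counting again.
            Specialised = false;
            ExpectedType = null;
            streak = 0;
        }

        // Count consecutive sightings of the same type.
        streak = (t == lastSeen) ? streak + 1 : 1;
        lastSeen = t;

        if (streak >= Threshold)
        {
            Specialised = true;
            ExpectedType = t;
        }
        return false;
    }
}

class WarmingDemo
{
    static void Main(string[] args)
    {
        WarmingCallSite site = new WarmingCallSite();
        int fast = 0, slow = 0;
        for (int i = 0; i < 10; i++)
        {
            if (site.Observe("boo")) fast++; else slow++;
        }

        // Prints "slow=3 fast=7" with the assumed threshold of 3.
        Console.WriteLine("slow={0} fast={1}", slow, fast);
    }
}
```

A higher threshold means fewer wasted rewrites at call sites that see many types, at the cost of more slow calls before a stable site gets its fast path.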
