Windbg and the case of the unfreed garbage

Virtual machine technology has brought many benefits. The runtime can be much more featured than a standard runtime library of a language such as C, offering memory management, exceptions and thread management at a level that is easy to program, and in a manner that is much more cross platform. There are trade-offs at work here, as there are in all implementations. In particular, the garbage collector is implemented in user mode code (as is the whole CLR) and you get a complete garbage collectible heap in every process that is running managed code. Moreover, garbage collection is driven by allocation failure (or explicit user calls to the GC.Collect methods), waiting until there is no space in a generation before scheduling a collection, and using a number of heuristics in order to decide which of the generations should be collected. These heuristics also control when memory is returned back to the operating system. Typically, the larger the heap, the more efficient the garbage collection process, so memory can be held for longer than you might naively expect.

We’ve all used the task manager to look at the size of managed processes. When the process is blocked, no allocations means that no garbage collections are scheduled, and hence there is nothing which is going to release unused pages back to the operating system.

Of course it would be nice if the garbage collector could respond to the OS when memory gets low. But just does it do this? We might not have reflection in the unmanaged world, but a combination of symbols from a symbol server, and other tools makes it possible to investigate.

First, I knew that you can subscribe to the OS event that is raised when memory is running low via the function CreateMemoryResourceNotification. The first thing to check is whether the standard dlls that make up the CLR import this function from the OS interface libraries. Doing dumbin on clr and some other key dlls

   dumpbin c:\windows\Microsoft.NET\Framework\v4.0.30319\clr.dll /imports

listed imports for a number of libraries, but not this function.

Right, time to get dynamic. I wrote a quick console application that blocked inside Console.ReadLine() and then started it under WinDbg. Setting things up to use the Microsoft symbol server, and then using the examine command to find the symbol

  .symfix c:\localsymbols
  x *!*CreateMemoryResource*

found the symbol, so I could then breakpoint it.

   74870b4e          KERNELBASE!CreateMemoryResourceNotification
   bp KERNELBASE!CreateMemoryResourceNotification

We hit the symbol and I stepped up. This left the stack looking like this.

0:000> k
ChildEBP RetAddr 
00dbfa14 64773428 clr!WKS::gc_heap::initialize_gc+0xb0
00dbfa1c 6475e0da clr!WKS::GCHeap::Initialize+0x2f
00dbfa24 6476ae02 clr!ExecuteDLL+0x36d
00dbfb9c 6475ba2f clr!EEStartupHelper+0x846
00dbfbe4 647744fe clr!EEStartup+0x1e
00dbfc28 64801f39 clr!EnsureEEStarted+0xea
00dbfc68 64804162 clr!_CorExeMainInternal+0x8f
00dbfca4 71fff5a3 clr!_CorExeMain+0x4d
00dbfce0 72077f16 mscoreei!_CorExeMain+0x10a
00dbfcf0 72074de3 MSCOREE!ShellShim__CorExeMain+0x99
00dbfcf8 74e68543 MSCOREE!_CorExeMain_Exported+0x8
00dbfd04 7725bf39 KERNEL32!BaseThreadInitThunk+0xe
00dbfd48 7725bf0c ntdll!__RtlUserThreadStart+0x72
00dbfd60 00000000 ntdll!_RtlUserThreadStart+0x1b

The return register contains a handle

0:000> r eax
0:000> !handle 120 f
Handle 120
  Type             Event
  Attributes       0x10
  GrantedAccess    0x100001:
  HandleCount      22
  PointerCount     4573373
  Name             \KernelObjects\LowMemoryCondition
  Object Specific Information
    Event Type Manual Reset
    Event is Waiting

So the question is what waits on the handle? If we continue the application, we get to a point where there are only two managed threads running.

0:004> .loadby sos clr
0:004> !Threads
ThreadCount:      2
UnstartedThread:  0
BackgroundThread: 1
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
       ID OSID ThreadOBJ    State GC Mode     GC Alloc Context  Domain   Count Apt Exception
   0    1 2210 0119d608     2a020 Preemptive  02E34188:00000000 011655b0 1     MTA
   2    2 23c0 0116d458     2b220 Preemptive  00000000:00000000 011655b0 0     MTA (Finalizer)

The main thread is waiting inside a call to Console.Readline, The finalizer thread would be a potential thread that might wait on our event. After all, it spends most of its time waiting. So we can switch to it.

0:004> ~2s
eax=00000000 ebx=64cdf388 ecx=00000000 edx=00000000 esi=02cff850 edi=00000000
eip=7722e1a4 esp=02cff728 ebp=02cff8a8 iopl=0         nv up ei pl nz na pe nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
7722e1a4 c21400          ret     14h
0:002> k
ChildEBP RetAddr 
02cff724 7484c752 ntdll!NtWaitForMultipleObjects+0xc
02cff8a8 64787bc5 KERNELBASE!WaitForMultipleObjectsEx+0x10b
02cff8f4 647885f6 clr!WKS::WaitForFinalizerEvent+0xbe
02cff934 646e9843 clr!WKS::GCHeap::FinalizerThreadWorker+0x6e
02cff9c4 646e98fd clr!REGUTIL::EnvGetString+0xfc
02cff9d0 646e998a clr!SHash<StringSHashTraits<_ConfigStringKeyValuePair,unsigned short,CaseSensitiveStringCompareHash<unsigned short> > >::Lookup+0x11
02cffa58 647741c6 clr!EEConfig::GetConfiguration_DontUse_+0x1b0
02cffad4 647ca0c1 clr!WKS::GCHeap::FinalizerThreadStart+0x198
02cffb6c 74e68543 clr!Thread::intermediateThreadProc+0x4d
02cffb78 7725bf39 KERNEL32!BaseThreadInitThunk+0xe
02cffbbc 7725bf0c ntdll!__RtlUserThreadStart+0x72
02cffbd4 00000000 ntdll!_RtlUserThreadStart+0x1b

Now dump the stack to try to find parameters to the function we just returned from.

0:002> dd esp
02cff728  7484c752 00000003 64cdf388 00000001
02cff738  00000000 00000000 525d7525 011a6f40
02cff748  00000000 74e5002e 00000004 00000018
02cff758  40000060 00000000 7484ee9c 767f3b30
02cff768  00000000 0000020c 00000000 00000004
02cff778  64735f28 64735f28 7681301e 011b2ad0
02cff788  011b2abc 7722dc34 74841129 00000158
02cff798  00000000 74841151 525d7a75 000007d0

Looking at the values pointed to by the third item on the stack, it looks like a series of handles, the first of which is the handle to the notification object.

0:002> dd 64cdf388
64cdf388  00000120 00000158 000001b8 011a6f20
64cdf398  00000000 00000000 6475d2f4 00000005

So it looks like the handles are passed into the kernel wait method, and we need to see what happens to the return value from WaitForMultipleObjectsEx. I happen to know that it returns the index of the handle that was signalled, so we need to see what happens if the value zero is returned.

0:002> u 64787bc5
64787bc5 03c6            add     eax,esi
64787bc7 83e800          sub     eax,0
64787bca 0f844b451d00    je      clr!WKS::WaitForFinalizerEvent+0x12a (6495c11b)
64787bd0 83e802          sub     eax,2
64787bd3 7589            jne     clr!WKS::WaitForFinalizerEvent+0x292 (64787b5e)
64787bd5 33f6            xor     esi,esi
64787bd7 c7442414ffffffff mov     dword ptr [esp+14h],0FFFFFFFFh
64787bdf 89742418        mov     dword ptr [esp+18h],esi

We test and branch if the result is zero to the following code.

0:002> u 6495c11b
6495c11b 8b0d18f6cd64    mov     ecx,dword ptr [clr!GCHeap::FinalizerThread (64cdf618)]
6495c121 c7410801000000  mov     dword ptr [ecx+8],1
6495c128 833d84f4cd6400  cmp     dword ptr [clr!g_TrapReturningThreads (64cdf484)],0
6495c12f 7405            je      clr!WKS::WaitForFinalizerEvent+0x145 (6495c136)
6495c131 e8d324efff      call    clr!Thread::RareDisablePreemptiveGC (6484e609)
6495c136 8b0d88f4cd64    mov     ecx,dword ptr [clr!g_pGCHeap (64cdf488)]
6495c13c 6a02            push    2
6495c13e 8b01            mov     eax,dword ptr [ecx]
6495c140 6a01            push    1
6495c142 6a00            push    0
6495c144 ff5064          call    dword ptr [eax+64h]
6495c147 8b0d18f6cd64    mov     ecx,dword ptr [clr!GCHeap::FinalizerThread (64cdf618)]
6495c14d c7410800000000  mov     dword ptr [ecx+8],0
6495c154 8b4104          mov     eax,dword ptr [ecx+4]
6495c157 a85f            test    al,5Fh
6495c159 7405            je      clr!WKS::WaitForFinalizerEvent+0x16f (6495c160)

The target address is fetched by indirection, so we have to look at the relevant memory locations.

0:002> dd 64cdf488
64cdf488  0116c760 64ce6d70 646b1cec 00000000

0:002> dd 0116c760
0116c760  6475e0e8 abababab abababab feeefeee

0:002> dd 6475e0e8 +64h
6475e14c  6478c77f 64794b2c 64a09013 647877ac

And so we arrive at a function named GarbageCollect, which probably does what it says on the can.

0:002> u 6478c77f
6478c77f 55              push    ebp
6478c780 8bec            mov     ebp,esp
6478c782 51              push    ecx
6478c783 53              push    ebx
6478c784 56              push    esi
6478c785 8b7508          mov     esi,dword ptr [ebp+8]
6478c788 57              push    edi
6478c789 894dfc          mov     dword ptr [ebp-4],ecx

In summary then, low memory notifications cause our process to do a garbage collection, presumably with the intention of releasing memory back to the system. Though we are missing metadata, as long as we don’t need to go too deep into the code of an unmanaged method, it is possible to figure out what is going on.

This entry was posted in Computers and Internet. Bookmark the permalink.

2 Responses to Windbg and the case of the unfreed garbage

  1. Pingback: GC performing full blocking Gen2 collections under SustainedLowLatency despite there is no LowMemory notification from OS - Jan Krivanek - Site Home - MSDN Blogs

  2. Alois Kraus says:

    This was really a great investigation. I have found that when my application is really slow it is due to constant full GCs coming from the Finalizer thread. Now the question remains when this notification is fired to prevent getting into this state.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s