Question

I have a .NET 3.5 SP1 application that is an Excel add-in. The application is split into a parent AppDomain (Excel's) and a child domain in which we load all of our DLLs. When we want to update the application, we unload the child domain, replace the files and reload it.

Unfortunately, unloading the domain activates 2 worker threads that start consuming CPU cycles (20-40%).

If I debug with VS 2010, at the moment before and after the AppDomain.Unload call, no threads have a call stack apart from Excel's main thread. The child domain is indeed unloaded, because if I call Unload again I get an AppDomainUnloadedException.
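The double-unload check described above can be sketched roughly as follows. This is a minimal illustration, not the real add-in code: the class name `UnloadCheck` and the domain name "child" are made up, and it assumes the .NET Framework (AppDomain.CreateDomain is not supported on .NET Core and later).

```csharp
using System;

// Hypothetical console program; names are illustrative only.
public static class UnloadCheck
{
    public static void Main()
    {
        // Same setup style as the add-in: only ApplicationBase is set.
        AppDomainSetup setup = new AppDomainSetup();
        setup.ApplicationBase = AppDomain.CurrentDomain.BaseDirectory;

        AppDomain child = AppDomain.CreateDomain("child", null, setup);
        AppDomain.Unload(child);

        try
        {
            // Unloading an already unloaded domain should throw.
            AppDomain.Unload(child);
            Console.WriteLine("second unload succeeded (unexpected)");
        }
        catch (AppDomainUnloadedException)
        {
            Console.WriteLine("child domain is gone");
        }
    }
}
```

Note that this only confirms the domain itself was torn down. It says nothing about CLR threads that may still be busy afterwards, which is the actual symptom here.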

If I use Process Explorer, I can see that 2 threads are busy working away, even when the VS debugger has broken into the process. Looking at the call stack reveals nothing because there are no symbols.

  • ntkrnlpa.exe+0x6eacb
  • ntkrnlpa.exe+0x2bfd0
  • hal.dll+0x2ef2
  • ntkrnlpa.exe+0x6a6cf
  • ntdll.dll+0xe514
  • mscorwks.dll+0x992d
  • mscorwks.dll+0x52568
  • mscorwks.dll+0x15b469
  • kernel32.dll+0xb729

If I use WinDbg, I can see the call stack for the 2 renegade threads. It's always the same:

  • WARNING: Stack unwind information not available. Following frames may be wrong.
  • ntdll!KiFastSystemCallRet
  • mscorwks+0x992d
  • mscorwks!InstallCustomModule+0x1eca0
  • mscorwks!CorExitProcess+0x503b
  • kernel32!GetModuleFileNameA+0x1ba

I created a very simple test application to load/unload a child domain. With a simple 1-class assembly, it works without any problems. If I get it to load/unload the child domain of the real application, it triggers the same renegade threads.

The code that creates the child domain is as follows:

AppDomainSetup appSetup = new AppDomainSetup();
appSetup.ApplicationBase = baseDir;

var ps = new PermissionSet(System.Security.Permissions.PermissionState.Unrestricted);
return AppDomain.CreateDomain(name, null, appSetup, ps, null);

The communication from the parent to the child domain is via a proxy and reflection. The code to create it is below:

string assName = typeof(ApplicationProxy).Assembly.FullName;
string className = typeof(ApplicationProxy).FullName;

var obj = _childDomain.CreateInstanceAndUnwrap(assName, className, false, 
    System.Reflection.BindingFlags.Default,
    null, new object[]{_sessionGuid}, 
    CultureInfo.InvariantCulture,
    null, new Evidence(AppDomain.CurrentDomain.Evidence));

_proxy = (ApplicationProxy)obj;

I've googled the problem profusely and cannot find anybody with a similar problem. The application is 10 projects large so I can't post it.

I'm wondering if anybody has encountered something similar and has some tips for me. Otherwise does anybody have any thoughts on how to attack the problem?


Solution

Thanks to Hans for putting me on the right path.

There are a few classes with finalizers, so I put a breakpoint in each one. One of them calls ThreadPool.QueueUserWorkItem from its finalizer. The work item never gets called, and instead leaves these 2 threads (1 to abort the executing threads and 1 to run the finalizers) cycling forever.
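For illustration, here is a rough sketch of the pattern described above and one possible way around it. The answer does not say how the real code was changed, so everything here is an assumption: the class `ResourceHolder`, its `Cleanup` method and the `CleanupCount` counter are hypothetical. The idea is to do the work deterministically in Dispose and to keep the finalizer from doing anything that needs another thread while the domain is going away.

```csharp
using System;
using System.Threading;

// Hypothetical resource holder; all names here are illustrative.
public sealed class ResourceHolder : IDisposable
{
    // Counts completed cleanups so the behaviour can be observed.
    public static int CleanupCount;

    private int _disposed; // 0 = live, 1 = cleaned up

    public void Dispose()
    {
        // Run cleanup at most once, even if Dispose is called repeatedly.
        if (Interlocked.Exchange(ref _disposed, 1) != 0) return;
        Cleanup();
        // Cleanup is done, so the finalizer has nothing left to do.
        GC.SuppressFinalize(this);
    }

    ~ResourceHolder()
    {
        // The problematic pattern would look like this:
        //     ThreadPool.QueueUserWorkItem(delegate { Cleanup(); });
        // While the domain is unloading, skip the cleanup entirely.
        if (AppDomain.CurrentDomain.IsFinalizingForUnload() ||
            Environment.HasShutdownStarted)
        {
            return;
        }
        if (Interlocked.Exchange(ref _disposed, 1) == 0) Cleanup();
    }

    private static void Cleanup()
    {
        // Synchronous and short; touches no other managed objects.
        Interlocked.Increment(ref CleanupCount);
    }
}

public static class Program
{
    public static void Main()
    {
        using (ResourceHolder holder = new ResourceHolder())
        {
            // ... use holder ...
        }
        Console.WriteLine(ResourceHolder.CleanupCount);
    }
}
```

With something like this, the code that owns the child domain would dispose such objects before calling AppDomain.Unload, so the finalizer thread never has to queue work during the unload.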

I tested it in my test project and it is indeed the case.

Children, the lesson is don't let your manager write thread code.

Licensed under: CC-BY-SA with attribution