Why does my program consume 100% CPU under NVIDIA nView?
Question
I was recently working on a Windows program that would sometimes become unresponsive when scrolling through a large list of items in a production environment. Of course, it worked fine on my desktop. The production environment is:
- Windows XP-based workstation with two monitors
- NVIDIA video drivers with nView enabled
Of note is a Dr. Watson stack trace generated when the process is terminated:
    State Dump for Thread Id 0xef4

    eax=00e3fff8 ebx=000000a0 ecx=00e00000 edx=00000000 esi=0003fff8 edi=00e40000
    eip=00b920c2 esp=0012bcac ebp=00000000 iopl=0         nv up ei ng nz na pe cy
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000283

    \system32\nview.dll - function: nview!NVLoadDatabase
    00b920a8 c80b0600         enter   0x60b,0x0
    00b920ac 83c30f           add     ebx,0xf
    00b920af 33f6             xor     esi,esi
    00b920b1 03f9             add     edi,ecx
    00b920b3 83e3f8           and     ebx,0xfffffff8
    00b920b6 3bcf             cmp     ecx,edi
    00b920b8 89742414         mov     [esp+0x14],esi
    00b920bc 734c             jnb     nview!NVLoadDatabase+0xcaf (00b9210a)
    00b920be 8bc1             mov     eax,ecx
    00b920c0 8b10             mov     edx,[eax]
    00b920c2 8b4004           mov     eax,[eax+0x4]     ds:0023:00e3fffc=00000000
    00b920c5 89442414         mov     [esp+0x14],eax
    00b920c9 8bc2             mov     eax,edx
    00b920cb 2500000001       and     eax,0x1000000
    00b920d0 33ed             xor     ebp,ebp
    00b920d2 0bc5             or      eax,ebp
    00b920d4 7414             jz      nview!NVLoadDatabase+0xc8f (00b920ea)
    00b920d6 8bc2             mov     eax,edx
    00b920d8 c1e008           shl     eax,0x8
    00b920db 8be8             mov     ebp,eax
    00b920dd c1f81f           sar     eax,0x1f

    ChildEBP RetAddr  Args to Child
    00000000 00000000 00000000 00000000 00000000 nview!NVLoadDatabase+0xc67
Why did this problem only occur in production?
Solution
This is interesting because nView is a third-party DLL provided by NVIDIA. Posts on the internet about nview!NVLoadDatabase suggest that there is an unpatched defect in nView. The reports describe the same symptom, with Explorer using 100% CPU. See: http://forums.nvidia.com/lofiversion/index.php?t36879.html
A detailed investigation of this problem is available on this site: http://blogs.technet.com/marcelofartura/archive/2007/02/28/real-case-random-apps-running-100-cpu.aspx
According to that article, the hang is caused by an infinite loop in nview.dll. Although the assembly instructions and register values described there do not exactly match those in our log, they were close enough for me to conclude that it is the same issue.
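For anyone who wants to check their own Dr. Watson log for the same signature, here is a minimal Python sketch. It assumes the log has the same shape as the excerpt in the question (an `eip=` register entry and a `function: module!symbol` line). The helper names `faulting_location` and `stuck_in_module` are made up for illustration and are not part of any tool.

```python
import re

# Patterns based only on the Dr. Watson excerpt in the question;
# other log versions may be laid out differently (assumption).
FUNCTION_RE = re.compile(r"function:\s*(\w+)!(\w+)")
EIP_RE = re.compile(r"\beip=([0-9a-fA-F]{8})\b")


def faulting_location(dump_text):
    """Return (module, symbol, eip) for the first thread dump found, or None."""
    func = FUNCTION_RE.search(dump_text)
    eip = EIP_RE.search(dump_text)
    if func is None or eip is None:
        return None
    return func.group(1), func.group(2), int(eip.group(1), 16)


def stuck_in_module(dump_text, module):
    """True if the first thread dump was executing inside `module`."""
    location = faulting_location(dump_text)
    return location is not None and location[0].lower() == module.lower()


if __name__ == "__main__":
    sample = (
        "eax=00e3fff8 ebx=000000a0 eip=00b920c2 esp=0012bcac\n"
        "\\system32\\nview.dll - function: nview!NVLoadDatabase\n"
    )
    print(faulting_location(sample))
    print(stuck_in_module(sample, "nview"))
```

A match on the module and symbol only shows that the thread was inside nview!NVLoadDatabase when the dump was taken. It does not prove the loop is infinite, so I would still compare the disassembly by hand as described above.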
To work around the problem, I disabled nView Desktop Manager (right-click the desktop, select nView Properties, and click Disable in the nView Desktop Manager group box). Before doing this, I could reproduce the hang consistently; afterwards, I could not reproduce it at all. This appears to be a viable workaround, and it would also explain why the problem appeared only in production, where nView is enabled.
Anyway, I posted this here in case it is useful to anyone. It caused me a LOT of grief chasing this one down.