Because `iisreset` didn't help and you had to restart the whole machine, I would suspect a global resource shortage that hit the busiest (or most resource-hungry) website hardest. It could be a lack of available RAM, or network connection congestion caused by malfunctioning calls (for example, a pile-up of `CLOSE_WAIT` sockets exhausting the connection pool; we've seen that in production after an external service malfunctioned). It could also be a single misbehaving client that got disconnected by the machine restart, which is why the problem eventually disappeared.
I would start with:

Historical analysis
- review the Event Viewer for any errors/warnings from that period of time,
- although you have already looked into the IIS logs, I would go through them again with Log Parser Lizard to build statistics such as the number of requests per client, network bandwidth per client, average response time per client, and so on.
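If Log Parser is not at hand, the same per-client statistics can be scripted directly. Here is a minimal Python sketch over a W3C-format IIS log; the log excerpt, IPs and URLs below are made up for illustration:

```python
from collections import defaultdict

# Made-up excerpt of a W3C extended IIS log (#Fields declares the columns)
SAMPLE_LOG = """#Software: Microsoft Internet Information Services 7.5
#Fields: date time c-ip cs-method cs-uri-stem sc-status time-taken
2017-05-01 10:00:01 10.0.0.5 GET /index.aspx 200 120
2017-05-01 10:00:02 10.0.0.5 GET /api/data 200 340
2017-05-01 10:00:03 10.0.0.9 GET /index.aspx 500 4500
"""

def stats_per_client(log_text):
    """Return {client IP: (request count, average time-taken in ms)}."""
    fields = []
    requests = defaultdict(int)
    total_time = defaultdict(int)
    for line in log_text.splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]  # column names for subsequent rows
            continue
        if not line or line.startswith("#"):
            continue
        row = dict(zip(fields, line.split()))
        ip = row["c-ip"]
        requests[ip] += 1
        total_time[ip] += int(row["time-taken"])
    return {ip: (n, total_time[ip] / n) for ip, n in requests.items()}

print(stats_per_client(SAMPLE_LOG))
# → {'10.0.0.5': (2, 230.0), '10.0.0.9': (1, 4500.0)}
```

A client with an outsized request count or average response time is a good first suspect to correlate against the outage window.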
Monitoring
- continuously monitor these Performance Counters:
  - `\Processor(_Total)\% Processor Time`
  - `\.NET CLR Exceptions(_Global_)\# of Exceps Thrown / sec`
  - `\Memory\Available MBytes`
  - `\Web Service(Default Web Site)\Current Connections` (one instance per site name)
  - `\ASP.NET v4.0.30319\Request Wait Time`
  - `\ASP.NET v4.0.30319\Requests Current`
  - `\ASP.NET v4.0.30319\Requests Queued`
  - `\Process(XXX)\Working Set` and `\Process(XXX)\% Processor Time` (XXX = each w3wp process instance)
  - `\Network Interface(XXX)\Bytes Total/sec`
- feed the counter data collected around the time of failure into the Performance Analysis of Logs (PAL) tool for a very detailed automated analysis,
- run `netstat -ano` to analyze network connections (or, even better, use the TCPView tool).
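To see when a counter crossed a worrying threshold, the captured data can also be post-processed offline. A minimal Python sketch, assuming the counters were exported to CSV with `typeperf` (the server name, file names, sample values and thresholds below are all made up):

```python
import csv
import io

# Hypothetical CSV captured with something like:
#   typeperf -cf counters.txt -si 15 -o perf.csv
# First column is the PDH timestamp, the rest are counter values.
SAMPLE_CSV = (
    '"(PDH-CSV 4.0)","\\\\SERVER\\Memory\\Available MBytes",'
    '"\\\\SERVER\\ASP.NET v4.0.30319\\Requests Queued"\n'
    '"05/01/2017 10:00:00","512","0"\n'
    '"05/01/2017 10:00:15","96","42"\n'
)

def flag_samples(csv_text, mem_floor_mb=128.0, max_queued=0.0):
    """Return timestamps where memory ran low or requests started queuing."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    mem_col = next(i for i, h in enumerate(header) if "Available MBytes" in h)
    queue_col = next(i for i, h in enumerate(header) if "Requests Queued" in h)
    flagged = []
    for row in reader:
        if float(row[mem_col]) < mem_floor_mb or float(row[queue_col]) > max_queued:
            flagged.append(row[0])
    return flagged

print(flag_samples(SAMPLE_CSV))
# → ['05/01/2017 10:00:15']
```

This gives you concrete timestamps to line up against the IIS logs and Event Viewer entries from the same window.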
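The interesting part of the `netstat` output is how many connections each remote endpoint holds in each state; a pile of `CLOSE_WAIT` entries toward one peer is the smoking gun mentioned above. A small Python sketch over sample `netstat -ano` output (addresses and PIDs are made up):

```python
from collections import Counter

# Made-up fragment of `netstat -ano` output:
# proto, local address, remote address, state, owning PID
SAMPLE_NETSTAT = """\
  TCP    10.0.0.2:51234   203.0.113.7:443      CLOSE_WAIT    4312
  TCP    10.0.0.2:51235   203.0.113.7:443      CLOSE_WAIT    4312
  TCP    10.0.0.2:80      198.51.100.3:60001   ESTABLISHED   4312
"""

def states_per_peer(netstat_text):
    """Count TCP connections grouped by (remote IP, connection state)."""
    counts = Counter()
    for line in netstat_text.splitlines():
        parts = line.split()
        if len(parts) >= 5 and parts[0] == "TCP":
            remote_ip = parts[2].rsplit(":", 1)[0]  # strip the port
            counts[(remote_ip, parts[3])] += 1
    return counts

print(states_per_peer(SAMPLE_NETSTAT))
# → Counter({('203.0.113.7', 'CLOSE_WAIT'): 2, ('198.51.100.3', 'ESTABLISHED'): 1})
```

Hundreds of `CLOSE_WAIT` connections to a single remote IP would point at an external dependency whose responses are not being closed properly.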
If none of this leads you to a conclusion, create a Debug Diagnostics rule that captures a memory dump of the process on long-running requests, and analyze the dump with WinDbg and the PSSCor extension for .NET debugging.