Question

I have an IIS Web Server that hosts 400 web applications (distributed across 30 application pools). They are both ASP.NET applications and WCF Services end points. The server has 32GB of RAM and is usually running fast; although it's running at 95% memory usage. Worker processes each take between 500MB and 1.5GB of RAM.

I also have another box running SQL Server. That one has plenty of free memory.

Sometimes, the Web Server starts throwing SQL Timeout exceptions. A few per minutes at first and rapidly increasing to hundreds per minute; effectively making the server down. This problem affects applications in all pools. Some requests still complete but most of them don't. While this happens the CPU usage on the server is around 30% (which is the normal load on that box).

While this is happening, we can still use SQL Server Management Studio (from the IIS Server) to execute requests successfully (and fast).

The fix is to restart IIS. And then everything goes back to normal until the next time.

Because the server is running with very low memory, I feel like this is the cause. But I cannot explain the relationship between low memory and sudden bursts of SQL Timeout exceptions.

Any idea?

Was it helpful?

Solution

Memory pressure can trigger paging and garbage collection. Both introduce latency which would not be present otherwise.

GC'ing 32GB of data can take seconds. Why would all app processes GC at the same time? Because at about 95% memory utilization Windows sets a "low memory" event that the CLR listens to. It will try to release memory to help other processes.

If the applications get into a paging frenzy that would also explain huge delays in normal execution.

This is just guessing, though. You can try proving it by looking at the "Hard page faults/sec" counter. There also must be a counter for "full GC" or "Gen 2 GC".

The fix would be running at a higher margin to the physical memory limit.

OTHER TIPS

The first problem is to discover where the timeout is happening. Can you tell from the stack trace if the timeout is happening when executing a request against the database, or when connecting to the database? (Or even connecting to the web server?)

Timeouts executing database requests can be a variety of causes. The problem might be in the database with blocking processes, database maintenance (also locking), deadlocks, etc. When apps are running slowly, do you see a lot of entries in sys.dm_exec_requests, and if so, what are their wait_types?

Even if you can run SQL in the query window while the web server is timing out, that doesn't mean there isn't massive blocking or deadlocking going on.

If it is a timeout connecting to the database, then it is possible the ADO connection pools are overwhelmed and not getting cleaned up, or the database has a connection limit, and the web services are timing out waiting for a connection.

One of the best ways to find out what is going on is to capture a memory dump of the w3wp.exe process and analyze it. Even if you aren't adept at a debugger like WinDbg, Microsoft's DebugDiag tool can produce some nice reports with helpful information.

SqlCommand.CommandTimeout

This property is the cumulative time-out for all network reads during command execution or processing of the results. A time-out can still occur after the first row is returned, and does not include user processing time, only network read time.

It is a client based time out. If stuff is getting queued due to memory constraints then that could cause a timeout.

Are you retrieving a lot of data from these queries?

If some queries return a lot of data consider breaking them up and give the user a next and prior button.

Have you considered asynch like BeginExecuteReader?
The advantage is no timeout.
It does not release the calling thread.

isExecutingFTSindexWordOnce = true;
sqlCmdFTSindexWordOnce.BeginExecuteNonQuery(callbackFTSindexWordOnce, sqlCmdFTSindexWordOnce);
// isExecutingFTSindexWordOnce set to false in the callback
Debug.WriteLine("Calling thread active");

But I agree with your comment how to respond to the request as the answer does not come back to the calling thread.
Sorry I am used to WPF where I just update a public property on the call back.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top