Question

My Drupal 6 site has been running smoothly for years but recently has experienced intermittent periods of extreme slowness (10-60 sec page loads). Several hours of slowness followed by hours of normal (4-6 sec) page loads. The page always loads with no error, just sometimes takes forever.

My setup:

  • Windows Server 2003
  • Apache/2.2.15 (Win32) Jrun/4.0
  • PHP 5
  • MySQL 5.1
  • Drupal 6
  • ColdFusion 9
  • VMware virtual environment
  • DMZ behind a corporate firewall
  • Traffic: 1-3 hits/sec peak

Troubleshooting

  • No applicable errors in the Apache error log
  • No errors in the Drupal event log
  • Drupal devel module shows 242 queries in 366.23 ms, page execution time 2069.62 ms. (So it looks like queries and PHP scripts are not the problem.)
  • No unusually high CPU, memory, or disk I/O
  • ColdFusion apps and static pages outside of Drupal also load slowly
  • webpagetest.org test shows very high time-to-first-byte

The problem seems to be with Apache responding to requests, but previously I've only seen this behavior under 100% cpu load. Judging solely by resource monitoring, it looks as though very little is going on.

Here is the kicker - roughly half of the site's access comes from our LAN, but if I disable the firewall rule and block access from outside of our network, internal (LAN) access (1000+ devices) is speedy. But as soon as outside access is restored the site is crippled.

Apache config? Crawlers/bots? Attackers? I'm at the end of my rope, where should I be looking to determine where the problem lies?

------Edit:-----

Attached is a waterfall chart from webpagetest.org showing a 15 second load time. I've seen times as high as several minutes. And again, the server runs fine much of the time. The green areas indicate that the browser has sent a request and is waiting to receive the first byte of data back from the server. This is certainly a back-end delay, but it is puzzling that the CPU is barely used during this slowness.

(Not enough rep to post an image, see https://webmasters.stackexchange.com/questions/54658/apache-very-high-page-load-time)

------Edit------

On the Apache side of things - Is this possibly a ThreadsPerChild issue?


Solution

After much research, I may have found the solution. If I'm correct, it was an Apache config problem, specifically the "ThreadsPerChild" directive. See http://httpd.apache.org/docs/2.2/platform/windows.html

Because Apache for Windows is multithreaded, it does not use a separate process for each request, as Apache can on Unix. Instead there are usually only two Apache processes running: a parent process, and a child which handles the requests. Within the child process each request is handled by a separate thread.

ThreadsPerChild: This directive is new. It tells the server how many threads it should use. This is the maximum number of connections the server can handle at once, so be sure to set this number high enough for your site if you get a lot of hits. The recommended default is ThreadsPerChild 150, but this must be adjusted to reflect the greatest anticipated number of simultaneous connections to accept.

Turns out, this directive was not set at all in my config and thus defaulted to 64. I confirmed this by viewing the number of threads for the second httpd.exe process in task manager. When the server was hitting more than 64 connections, the excess requests were simply having to wait for a thread to open up. I added ThreadsPerChild 150 in my httpd.conf.
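For reference, the change described above amounts to a single line in httpd.conf. This is only a sketch: 150 is the value suggested in the quoted documentation, not a number tuned for any particular site, and the right setting depends on the peak number of simultaneous connections you actually observe.

```apache
# httpd.conf (Apache 2.2 on Windows)
# Number of worker threads in the single child process. On Windows this
# is also the ceiling on simultaneous connections; requests beyond it
# wait for a free thread.
# 150 is the documentation's suggested value, used here as an example.
ThreadsPerChild 150
```

Apache has to be restarted for the new value to take effect. Afterwards the thread count of the child httpd.exe process in Task Manager should rise accordingly.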

Additionally, I enabled the apache status module http://httpd.apache.org/docs/2.2/mod/mod_status.html

...which, among other things, allows one to see the total number of active requests on the server at any given moment. Right away, I could see spikes of up to 80 active requests. Time will tell, but I'm confident that this will resolve my issue. So far, 30 hours without a hiccup.
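A minimal sketch of enabling the status page on Apache 2.2 is shown below. The /server-status path is the conventional one, and restricting access to 127.0.0.1 is an assumption made for this example, so adjust the Allow line to whatever admin hosts you trust.

```apache
# httpd.conf (Apache 2.2 syntax)
LoadModule status_module modules/mod_status.so

# Optional: include per-request details in the status output
ExtendedStatus On

<Location /server-status>
    SetHandler server-status
    # Assumption for this sketch: only the server itself may view the page
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```

With this in place, the status page lists how many requests are currently being processed and how many workers are idle. If the busy count sits at the ThreadsPerChild value during slow periods, the thread limit is the bottleneck.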

OTHER TIPS

Apache is too bulky and clumsy, even for "1-3 hits/sec peak".

I once had a similar problem with a much lighter site (almost static HTML, no DB) and a similar number of hits per second. No errors, no high network/CPU/memory/disk load. Apache on WinXP.

I put nginx in front of Apache to serve the static files, and it started working like a charm.

Caching. The solution is caching.

Drupal (in common with most other large CMS platforms) has a tendency toward this kind of thing due to its nature -- every page is built on the fly, constructed from a whole stack of database tables and code modules. The more you've got in there, the slower it will be, but even fairly simple pages can become horribly slow if your site gets a bit of traffic.

Drupal has a built-in page cache mechanism which will cut your load dramatically. As long as your pages are static (i.e. no dynamic content), you can simply switch on caching and watch the performance go right back up.

If you have dynamic content, you can still enable caching for the static parts of the page. It is a bit more complex (and beyond the scope of this answer), but it is worth the effort.

If that's still not enough, a server-based caching solution such as Varnish will definitely help.

Licensed under: CC-BY-SA with attribution