Question

I know the title of my question is rather vague, so I'll try to clarify as much as I can. Please feel free to moderate this question to make it more useful for the community.

Given a standard LAMP stack with more or less default settings (a bit of tuning is allowed, client-side and server-side caching turned on), running on modern hardware (16Gb RAM, 8-core CPU, unlimited disk space, etc), deploying a reasonably complicated CMS service (a Drupal or Wordpress project for arguments sake) - what amounts of traffic, SQL queries, user requests can I resonably expect to accommodate before I have to start thinking about performance?

NOTE: I know that specifics will greatly depend on the details of the project, i.e. optimizing MySQL queries, indexing stuff, minimizing filesystem hits - assuming web developers did a professional job - I'm really looking for a very rough figure in terms of visits per day, traffic during peak visiting times, how many records before (transactional) MySQL fumbles, so on.

I know the only way to really answer my question is to run load testing on a real project, and I'm concerned that my question may be treated as partly off-top.

I would like to get a set of figures from people with first-hand experience, e.g. "we ran such and such set-up and it handled at least this much load [problems started surfacing after such and such]". I'm also greatly interested in any condenced (I'm short on time atm) reading I can do to get a better understanding of the matter.

P.S. I'm meeting a client tomorrow to talk about his project, and I want to be prepared to reason about performance if his project turns out to be akin FourSquare.

Was it helpful?

Solution

Very tricky to answer without specifics as you have noted. If I was tasked with what you have to do, I would take each component in turn ( network interface, CPU/memory, physical IO load, SMP locking etc) and get the maximum capacity available, divide by rough estimate of use per request.

For example, network io. You might have 1x 1Gb card, which might achieve maybe 100Mbytes/sec. ( I tend to use 80% of theoretical max). How big will a typical 'hit' be? Perhaps 3kbytes average, for HTML, images etc. that means you can achieve 33k requests per second before you bottleneck at the physical level. These numbers are absolute maximums, depending on tools and skills you might not get anywhere near them, but nobody can exceed these maximums.

Repeat the above for every component, perhaps varying your numbers a little, and you will build a quick picture of what is likely to be a concern. Then, consider how you can quickly get more capacity in each component, can you just chuck $$ and gain more performance (eg use SSD drives instead of HD)? Or will you hit a limit that cannot be moved without rearchitecting? Also take into account what resources you have available, do you have lots of skilled programmer time, DBAs, or wads of cash? If you have lots of a resource, you can tend to reduce those constraints easier and quicker as you move along the experience curve.

Do not forget external components too, firewalls may have limits that are lower than expected for sustained traffic.

Sorry I cannot give you real numbers, our workloads are using custom servers, high memory caching and other tricks, and not using all the products you list. However, I would concentrate most on IO/SQL queries and possibly network IO, as these tend to be more hard limits, than CPU/memory, although I'm sure others will have a different opinion.

OTHER TIPS

Obviously, the question is such that does not have a "proper" answer, but I'd like to close it and give some feedback. The client meeting has taken place, performance was indeed a biggie, their hosting platform turned out to be on the Amazon cloud :)

From research I've done independently:

  • Memcache is a must;
  • MySQL (or whatever persistent storage instance you're running) is usually the first to go. Solutions include running multiple virtual instances and replicate data between them, distributing the load;
  • http://highscalability.com/ is a good read :)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top