Question

I'm currently working on a large Magento (Enterprise Edition) project where the client wants to use four load-balanced web servers and four MySQL servers (one master, three slaves). I have not been able to find any clear information on the best way to configure Magento in this situation as far as cache and media storage are concerned. Would anyone who has done something similar care to share their experience or point to a list of best practices for this?

My current plan is:

  • Store session data in DB
  • A single memcache server accessible from all four Apache servers for cache
  • A single Solr server for search
  • NFS drive mounted to /media on web servers

I've seen some recommendations where multiple memcache servers are used, such as one on each web server in the cluster. Wouldn't this cause problems and decrease performance when something is cached on one host and not the other?


Solution

Apache

You're on the right track. Read-balancing across multiple instances gives you maximum concurrency. Read the whitepaper from Nexcess released at Imagine 2013 and tune your Apache properly. Use PHP-FPM. Use fast caches: Redis for the FPC and the backend cache. Use multiple Memcached instances, specifically for sessions. Use APC.

I've seen some recommendations where multiple memcache servers are used, such as one on each web server in the cluster. Wouldn't this cause problems and decrease performance when something is cached on one host and not the other?

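In short, no. Read up on how memcache distribution works: the client hashes each key to exactly one server in the pool, so as long as every web host uses the same server list, an item cached by one host is found by all the others. There is no "right" answer here. Catalog size, traffic patterns, and peak bursts will all determine how you should tune your store.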
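
To illustrate why a pool of memcache servers behaves like one logical cache, here is a minimal sketch of client-side key distribution. It assumes a simple hash-modulo scheme; real clients may use consistent hashing instead, and the server addresses and the `server_for` helper are hypothetical names chosen for this example.

```python
import zlib

# Hypothetical pool: every web host must configure the same servers in the same order.
SERVERS = ["10.0.0.1:11211", "10.0.0.2:11211", "10.0.0.3:11211", "10.0.0.4:11211"]


def server_for(key, servers=SERVERS):
    """Return the one server that owns this key (hash-modulo scheme, an assumption)."""
    # crc32 is deterministic across processes and hosts, unlike Python's built-in hash().
    return servers[zlib.crc32(key.encode("utf-8")) % len(servers)]


if __name__ == "__main__":
    for key in ("catalog_product_42", "config_global", "layout_home"):
        print(key, "->", server_for(key))
```

Because the mapping depends only on the key and the shared server list, any web host asking for the same key is sent to the same memcache server. The trade-off with plain modulo hashing is that adding or removing a server remaps most keys, which is the problem consistent hashing is meant to reduce.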

MySQL

The client wants to use four MySQL servers (one master, three slaves).

Having worked for some of the highest-traffic sites in Magento I can absolutely say that there is no need for read-balancing across four instances. Especially with the benefits in EE 1.13 and FPC.

Read the information from @sonassi below about configuring Magento to handle more than one database (it does so out of the box).

How to optimise database architecture for high volume sites?

If you still want to do this, I recommend using MySQL Proxy to split off read/write traffic and round-robin it.
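
The routing idea behind read/write splitting can be sketched as follows. This is not MySQL Proxy's API (which is scripted differently); it is only a small illustration of the logic, and the `ReadWriteRouter` class and host names are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send plain SELECTs to replicas in round-robin order, everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        # cycle() yields the replicas endlessly in order: s1, s2, s3, s1, ...
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        statement = sql.strip().upper()
        is_read = statement.startswith("SELECT")
        # Locking reads must see the latest data and take locks on the master.
        locks_rows = "FOR UPDATE" in statement
        if is_read and not locks_rows and self._replicas is not None:
            return next(self._replicas)
        return self.master


if __name__ == "__main__":
    router = ReadWriteRouter("master", ["slave1", "slave2", "slave3"])
    for query in ("SELECT 1", "SELECT 2", "UPDATE t SET a = 1", "SELECT 3", "SELECT 4"):
        print(query, "->", router.route(query))
```

A real deployment also has to account for replication lag and for reads inside transactions, both of which usually need to go to the master; this sketch ignores them.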

Other considerations

  • Gigabit!
  • Avoid virtualization
  • Run DBs with tons of RAM
  • Run Apache with tons of CPU, even more RAM.
  • Use FPC
  • Use Redis
  • Use cloud storage for media. If it's not an option, NFS is suitable.

Want even more? Read this:

http://www.sonassi.com/knowledge-base/mysql-and-magento-peformance-tuning/

OTHER TIPS

You’re on a great path with that cluster configuration. I strongly recommend a single server for Redis or Memcached, rather than running instances on each web host.

Here’s the full list of configurations I’ve used for a highly available, fault tolerant, distributed, and load balanced LEMP cluster. It includes app/etc/local.xml, the core_config_data table, and configurations for MySQL, php-fpm, nginx, and Redis. All hosts run Ubuntu 12.04 LTS 64-bit.

You should consider moving one or two of those database hosts to a cache role. I’m seeing a much heavier load on the host running Redis (sessions, backend, and FPC) than I am on the MySQL master and slave.

Highlights

  • Admin users: 46
  • Categories: 2,450 (largest one has 2,400 products)
  • Product entities: 101,000
  • Combo products: 484
  • Product relations: 54,000
  • In stock and enabled configurable products: 10,100
  • CMS blocks: 3,100
  • CMS pages: 1,400

August 2013 traffic:

  • 40 million monthly pageviews
  • 2.3 million unique visitors
  • 46,000 monthly checkouts
  • 89% of visitors from the USA
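
To put the monthly figures above in per-second terms, here is a small back-of-the-envelope calculation. It assumes traffic is spread evenly over the 31 days of August, which understates real peaks, so treat the results as averages only.

```python
SECONDS_PER_DAY = 86_400
DAYS_IN_AUGUST = 31

# Monthly figures quoted in the list above.
pageviews = 40_000_000
visitors = 2_300_000
checkouts = 46_000

avg_pageviews_per_second = pageviews / (DAYS_IN_AUGUST * SECONDS_PER_DAY)
checkouts_per_day = checkouts / DAYS_IN_AUGUST
pageviews_per_visitor = pageviews / visitors
checkouts_per_visitor = checkouts / visitors

if __name__ == "__main__":
    print(f"average pageviews per second: {avg_pageviews_per_second:.1f}")
    print(f"checkouts per day:            {checkouts_per_day:.0f}")
    print(f"pageviews per visitor:        {pageviews_per_visitor:.1f}")
    print(f"checkouts per unique visitor: {checkouts_per_visitor:.1%}")
```

On these assumptions the cluster averages roughly 15 pageviews per second and about 1,480 checkouts per day, with about 2% of unique visitors checking out.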

Web hosts

There are 10 hosts behind redundant, highly available hardware firewalls and hardware load balancers.

  • site-wide average response time: 282 ms
  • load average: 0.6 to 1.0 (in tests, performance degrades by 35% when load averages hit ~5.0)
  • Dual Intel Xeon CPU E3-1230 V2 @ 3.30GHz (4 cores each)
  • 32 GB DDR3 1333 MHz RAM

Cache hosts

There are two hosts running Redis in a master-slave configuration with automated failover. Three Redis instances are used to increase throughput and provide fine-tuning of persistence behaviors.

  • 3,000 commands per second
  • 0.7 ms average response time
  • load average of 1.0 to 1.5
  • Quad Intel Xeon CPU E5-2620 0 @ 2.00GHz (6 cores each)
  • 128 GB buffered DDR3 1333 MHz RAM
  • Mechanical disks, RAID 1, hardware controller

Database hosts

There are two hosts running MySQL 5.6.11 in a master-slave configuration with warm failover.

  • 1,500 commands per second
  • 1.1 ms average response time
  • load average of 0.1 (master) and 0.4 (slave)
  • Quad Intel Xeon CPU E7- 2860 @ 2.27GHz (10 cores each)
  • 128 GB buffered DDR3 1333 MHz RAM
  • SSD, RAID 1+0, hardware controller
  • MySQL 5.6.11 with tcmalloc

It is quite simple: a web cluster with a single memcache instance and, for high volume, a separate memcache instance for sessions. DB read/write splitting is fine. NFS for media is needed anyway; if you have a CDN and everything is cached properly at multiple levels, requests will rarely hit the file system. Use Apache if you have the patience to tune it, or one of the other web servers if you do not. You can have the highest-performing, top enterprise-class site on standard installs with the correct config; it is not all about the absolute fastest technology.

Now your main problem: this setup is stupidly complex. There are two parts: a technical part, aimed at search engines such as Google, and a business part, aimed at visitors. You need both to co-exist equally, splitting the work between your time and someone else's knowledge, or be purely technically focused and do it all yourself, or be business focused and leave it fully managed by people who have already done it. Nexcess charge $1,000s per month for their clusters; the enterprise company we know charges $20,000+ for their cloud cluster config (new sites listing on the first page next to Amazon, eBay, Asos, Net-A-Porter).

Why is it like this? These sites are in the top 1% and top 0.1%, where almost all profits are made. This setup has the magic to take a portion of where the majority of profits are and rank on the first page alongside the largest retailers. In the end it is the sum of the parts: you may find pieces here and there, but anyone with the combined answer will put you under NDA, as we are.

A simple way to get a clustered and highly available environment for large projects is Auto-Scalable Magento Cluster in Containers. In short, the topology consists of:

  • Varnish load balancer is supplemented with NGINX server as HTTPS proxy
  • Scalable NGINX PHP application servers with preconfigured automatic horizontal scaling to handle load spikes
  • MySQL DB Cluster with asynchronous master-slave replication to ensure high availability, fault tolerance and data security
  • Redis Sessions node to retain user session parameters
  • Redis Cache node for content cache storing
  • Elastic Data Storage node for media files

Magento Cluster Scaling

Fine tuning and customization: multi-cloud HA and DR, geo distributed load balancing, performance optimization, WAF, CDN and other required addons can be installed afterward on demand.

If manual setup is preferable, you can find the source code in the GitHub repository.

Also, the underlying infrastructure plays a key role for highly loaded ecommerce projects. Choose your cloud infrastructure vendor carefully.

Licensed under: CC-BY-SA with attribution