Question

I want to understand when my system is under load (memory and CPU) and when I should plan to scale.

Memory

I am using an EC2 instance with multiple processes running. They keep memory consumption between 80 and 90% all the time. Should I worry, or should I be happy that I am making the most of the available memory?

What should memory consumption look like, and under what circumstances should I worry about scaling?

CPU

I have another EC2 instance that runs some other processes. Most of the time the system's CPU utilization is only 18-20%, but at times it jumps to 90-100% for some of the processes.

Can anything go wrong, or is it just that the processes might slow down because no CPU cycles are available and will complete somewhat later, with any new process waiting for CPU cycles to free up?

Basically, I want to understand these scenarios and the values at which one should consider scaling up (vertically or horizontally).

Inline answers, pointers to further reading, anything is appreciated.

Solution

First of all: you have to define the thresholds for scaling yourself. They depend mainly on factors in your quality or stability guidelines and in your application itself; there is hardly a general rule for this. Here are some points to consider:

  • Some applications can run fine at 100% CPU usage (as long as there are no other jobs on the machine), while others might need to scale at an 80% threshold, for example. The same goes for memory.
  • Think about whether you have critical tasks that must finish within a specific time. If so, you have to make sure they get enough CPU and/or memory to do their job.
  • Observe and measure your system data continuously. I suggest setting up a system like Munin to show your performance data (and how it changes) over time. Interesting metrics are system load, CPU usage, memory consumption, I/O service time, etc. (see the sketch after this list for a minimal hand-rolled sampler).
  • Try to get an idea of what limits your application. For example, if you have a lot of CPU-intensive tasks, CPU is your limit. If you do a lot of I/O, keep an eye on the I/O stats, wait times, etc.
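
To make that concrete, here is a minimal sketch of a hand-rolled metrics sampler. It assumes the third-party psutil package is installed; the output file name and the sampling interval are illustrative. A real monitoring system like Munin does this for you, with graphing and history on top:

```python
# Minimal periodic resource sampler (assumes `pip install psutil`).
# Appends one CSV row per interval: timestamp, load, CPU %, memory %.
import csv
import time

import psutil

def sample_forever(out_path="metrics.csv", interval_s=60):
    with open(out_path, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["timestamp", "load_1m", "cpu_pct", "mem_pct"])
        while True:
            load_1m = psutil.getloadavg()[0]           # 1-minute load average
            cpu_pct = psutil.cpu_percent(interval=1)   # CPU usage over 1 second
            mem_pct = psutil.virtual_memory().percent  # memory currently in use
            writer.writerow([int(time.time()), load_1m, cpu_pct, mem_pct])
            f.flush()  # make each sample visible immediately
            time.sleep(interval_s)

if __name__ == "__main__":
    sample_forever()
```

Collecting the raw numbers in a flat file like this is enough to spot trends (e.g. memory creeping from 80% toward 95%) even before you have proper graphs.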

To sum up: the need for scaling depends on your application. Get to know it better in terms of system resource usage. Once you have a monitoring system set up, you can watch your system's performance over time.

A good read is "The Art of Capacity Planning". Searching for "capacity planning" will also turn up more material.

OTHER TIPS

It's far easier to measure performance than to predict it from resource usage, so set up a small probe with JMeter or wget and test your system roughly hourly to detect slowdowns.
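
As a rough illustration, a probe can be as simple as timing one HTTP request and flagging it when it crosses a threshold. The URL and the 2-second threshold below are placeholder assumptions, not recommendations; run it from cron to get the hourly cadence:

```python
# Minimal HTTP latency probe (standard library only).
# The endpoint and slow_threshold_s are illustrative values.
import time
import urllib.request

def probe(url="http://example.com/health", slow_threshold_s=2.0):
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=10) as resp:
        resp.read()  # include the body transfer in the timing
    elapsed = time.monotonic() - start
    status = "SLOW" if elapsed > slow_threshold_s else "OK"
    print(f"{status}: {url} took {elapsed:.2f}s")
    return elapsed

if __name__ == "__main__":
    probe()
```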

Before you start the regular probing, run a load test against a single target system and find out how many users it takes to push it into slowdown. That is the level of load you should stay below in production, by adding instances.
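
Here is a rough sketch of that ramp-up idea in plain Python. JMeter does this far more thoroughly; the test URL, the user limits, and the "double the baseline latency" definition of slowdown are all illustrative assumptions:

```python
# Ramp up concurrent simulated users against a *test* target and report
# the concurrency level at which median latency doubles from baseline.
import statistics
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

def timed_get(url):
    start = time.monotonic()
    with urllib.request.urlopen(url, timeout=30) as resp:
        resp.read()
    return time.monotonic() - start

def ramp(url="http://test-target.internal/", max_users=50, requests_per_user=5):
    baseline = None
    for users in range(1, max_users + 1):
        with ThreadPoolExecutor(max_workers=users) as pool:
            times = list(pool.map(timed_get, [url] * (users * requests_per_user)))
        median = statistics.median(times)
        baseline = baseline or median  # first round sets the baseline
        print(f"{users:3d} concurrent users: median {median:.3f}s")
        if median > 2 * baseline:  # "slowdown" = double the baseline latency
            print(f"Slowdown begins around {users} concurrent users.")
            return users
    return None

if __name__ == "__main__":
    ramp()
```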

Only then measure resource usage to find the root cause of the problem, and see whether you can provision more of that resource for your instances.

--dave

Licensed under: CC-BY-SA with attribution