Question

Whenever I try to add up the CPU utilization percentages reported by commands like top or mpstat, and in particular by the collectd service, I can't get exactly 100% CPU utilization.

For example, here are top results from a test server on Amazon EC2:

Cpu(s): 13.6%us, 31.6%sy,  0.0%ni, 53.2%id,  0.0%wa,  0.0%hi,  0.0%si,  1.7%st

No matter how I add up the percentages, I never quite get 100% CPU, certainly not in any logical way. Mostly it looks like rounding error (100.1% or 99.9%), but sometimes I end up with over 110%. This usually happens when steal is relatively high: in one case collectd reported ~21.44% steal and ~88% idle, and those two alone are already well over 100%. I understand that ni (nice) is also counted in us (user), so I shouldn't add it, but that still doesn't work out.

Does anybody know how to add these up to 100% or how to interpret the exceptional cases that collectd sometimes reports?


Solution

collectd (like top, htop, vmstat, or any similar utility) reports an average over an interval, based on counters it reads from the kernel. The kernel generally does not use floating-point math for this accounting and does not try to account exhaustively for every slice of time, so the figures can't be 100% accurate. Sometimes they add up to a little less than 100%, sometimes to more. They are not intended for auditing, only as a general indication of where time is being spent.
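To make the averaging and rounding concrete, here is a minimal Python sketch, assuming the Linux /proc/stat layout for the aggregate "cpu" line. The helper names (parse_cpu_line, percentages) and the two counter samples are hypothetical and invented for illustration; this is not how collectd or top is actually implemented. It also adds up the one-decimal figures from the top line in the question.

```python
# Field order of the aggregate "cpu" line in /proc/stat (Linux).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a 'cpu  ...' line from /proc/stat into a dict of tick counters."""
    parts = line.split()
    # Skip the trailing guest/guest_nice columns: the kernel already counts
    # that time in user/nice, so adding them again would double count.
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def percentages(before, after):
    """Share of each state between two samples, in percent of total ticks."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical samples, invented for illustration (not from a real host).
before = parse_cpu_line("cpu  1000 0 2000 5000 0 0 0 100 0 0")
after = parse_cpu_line("cpu  1136 0 2316 5532 0 0 0 117 0 0")

exact = percentages(before, after)
print(round(sum(exact.values()), 6))  # unrounded shares sum to 100 by construction

# The one-decimal figures from the top line quoted in the question:
# us, sy, ni, id, wa, hi, si, st
top_sample = (13.6, 31.6, 0.0, 53.2, 0.0, 0.0, 0.0, 1.7)
print(round(sum(top_sample), 1))  # 100.1
```

When every share is computed from the same two samples, the unrounded values sum to 100 by construction, and the 100.1 from the question is what rounding eight fields to one decimal can produce. Larger deviations, such as the 110% case, cannot come from rounding alone; they point to the counters themselves being inconsistent over the interval.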

OTHER TIPS

I can confirm that this has nothing to do with collectd; it comes from kernel accounting. The inaccuracy is particularly substantial on tickless systems and/or in throttling states.

Licensed under: CC-BY-SA with attribution