Question

I am writing a system performance script in bash and want to compute CPU usage as a percentage. I have two implementations, one using awk and one using bc, and would like to know which of the two is more efficient. Is it better to use awk or bc for floating-point computations? Thanks!

Version #1 (Using bc)

CPU=$(mpstat 1 1 | grep "Average" | awk '{print $11}')
CPU=$(echo "scale=2;(100-$CPU)" | bc -l)
echo $CPU

Version #2 (Using awk)

CPU=$(mpstat 1 1 | grep "Average" | awk '{idle = $11} {print 100 - idle}')
echo $CPU

Solution

Since the processing time of both is going to be tiny, the version that spawns the fewest processes and subshells is going to be "more efficient".

That's your second example.

But you can make it even simpler by eliminating the grep:

CPU=$(mpstat 1 1 | awk '/Average/{print 100 - $11}')
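If the script needs a fixed number of decimals, awk can also format the result itself, so no extra process is required. A minimal sketch, assuming the idle percentage is the last column of the Average line (worth checking against your mpstat version); the function name cpu_usage is made up for illustration, and a hard-coded sample line stands in for real mpstat 1 1 output:

```shell
#!/bin/bash
# Hypothetical helper: reads mpstat-style output on stdin and prints
# the CPU usage in percent with two decimals.
cpu_usage() {
    # Assumption: %idle is the last field ($NF) of the "Average" line.
    awk '/Average/ { printf "%.2f\n", 100 - $NF }'
}

# Sample line standing in for the output of: mpstat 1 1
sample="Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88"

CPU=$(printf '%s\n' "$sample" | cpu_usage)
echo "$CPU"    # prints 12.12
```

In the real script the assignment would become CPU=$(mpstat 1 1 | cpu_usage).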

OTHER TIPS

In version 1, why do you need the second line? Why not do the calculation in the first line itself? I ask because the first version is grep+awk+bc while the second is grep+awk, so I don't think the comparison is valid.

For using only bc, without awk, try this:

CPU=$(mpstat 1 1 | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc )
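The same bc-only idea can probably be written with fewer processes by using bash here-strings, so that read runs in the current shell and bc is the only external program started. This is a sketch, assuming the Average line has already been captured in a variable (a sample line is hard-coded here) and that bc is installed:

```shell
#!/bin/bash
# Sample line standing in for: line=$(mpstat 1 1 | grep Average)
line="Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88"

# read -a splits the line into the array P in the current shell (no pipeline subshell).
read -r -a P <<< "$line"

# P[10] is the 11th field, i.e. the idle percentage; bc does the subtraction.
CPU=$(bc <<< "100 - ${P[10]}")
echo "$CPU"
```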

Thanks, all, for educating me on awk/bc!
I did the benchmark (hopefully in a more proper way).
tl;dr: awk wins

Semi-long story:
Over 3 batches of 1000 runs each, awk averages 2.081333s per batch on my system, while bc averages 3.460333s.

Full story:

[me@thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88" | awk '/Average/ {print 100 - $11}' >/dev/null ; done

real    0m1.922s
user    0m0.320s
sys     0m1.308s
[me@thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88" | awk '/Average/{print 100 - $11}' >/dev/null ; done

real    0m2.124s
user    0m0.370s
sys     0m1.368s
[me@thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88" | awk '/Average/{print 100 - $11}' >/dev/null ; done

real    0m2.198s
user    0m0.412s
sys     0m1.383s

[me@thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done

real    0m3.799s
user    0m0.691s
sys     0m3.059s
[me@thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done

real    0m3.545s
user    0m0.604s
sys     0m2.801s
[me@thebox tmp]$ time for i in `seq 1 1000` ; do echo "Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88" | grep Average | { read -a P; echo 100 - ${P[10]}; } | bc >/dev/null ; done

real    0m3.037s
user    0m0.602s
sys     0m2.626s
[me@thebox tmp]$

Without further tracing, I believe this is due to the overhead of forking more processes when using bc.
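For comparison, the subtraction can also be done without starting any external program, using bash integer arithmetic on hundredths of a percent. This is only a sketch and not something benchmarked here: it assumes the idle value always has exactly two decimals and uses a dot as the decimal separator, as in the sample line above.

```shell
#!/bin/bash
# Sample line standing in for the "Average" line of mpstat 1 1
line="Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88"

read -r -a P <<< "$line"
idle=${P[10]}                      # 11th field, e.g. 87.88

# Convert to hundredths of a percent.
# The 10# prefix forces base 10 so that a part like "08" is not read as octal.
whole=${idle%.*}
frac=${idle#*.}
idle_hundredths=$(( 10#$whole * 100 + 10#$frac ))

# 100.00 percent is 10000 hundredths; format the result back with two decimals.
usage=$(( 10000 - idle_hundredths ))
printf -v CPU '%d.%02d' $(( usage / 100 )) $(( usage % 100 ))
echo "$CPU"    # prints 12.12
```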

I did the following benchmark:

#!/bin/bash

# Time 50 iterations of the awk-only version
count=0
tic="$(date +%s)"
while [ "$count" -lt 50 ]
do
    mpstat 1 1 | awk '/Average/{print 100 - $11}'
    count=$((count + 1))
done
toc="$(date +%s)"
sec=$((toc - tic))

# Time 50 iterations of the grep + awk + bc version
count=0
tic="$(date +%s)"
while [ "$count" -lt 50 ]
do
    CPU=$(mpstat 1 1 | grep "Average" | awk '{print $11}')
    echo "scale=2;(100-$CPU)" | bc -l
    count=$((count + 1))
done
toc="$(date +%s)"
sec1=$((toc - tic))

echo "Execution Time awk: $sec"
echo "Execution Time bc: $sec1"

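Both execution times were the same: 50 seconds. Apparently it does not make any measurable difference here. Note that each mpstat 1 1 call samples for one second, so 50 iterations take about 50 seconds regardless of the tool, and date +%s only has one-second resolution.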
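To compare only the arithmetic, the computation can be timed on a fixed input line instead of live mpstat output, with the bash time keyword giving sub-second resolution. A minimal sketch; the function names bench, awk_version and bc_version are invented for illustration, and the run count of 200 is arbitrary:

```shell
#!/bin/bash
# Fixed input, so the one-second mpstat sampling is not part of the measurement.
sample="Average:     all    5.05    0.00    6.57    0.51    0.00    0.00    0.00    0.00   87.88"

awk_version() {
    printf '%s\n' "$sample" | awk '/Average/ { print 100 - $11 }'
}

bc_version() {
    local idle
    idle=$(printf '%s\n' "$sample" | grep "Average" | awk '{ print $11 }')
    echo "scale=2; 100 - $idle" | bc -l
}

# bench RUNS COMMAND...: prints the elapsed wall-clock seconds for RUNS calls.
bench() {
    local runs=$1 i
    shift
    local TIMEFORMAT=%R    # make the time keyword report only the real time
    # time writes its report to stderr, so redirect the group's stderr to stdout;
    # the command's own output and errors are discarded inside the loop.
    { time for (( i = 0; i < runs; i++ )); do "$@" > /dev/null 2>&1; done; } 2>&1
}

awk_time=$(bench 200 awk_version)
bc_time=$(bench 200 bc_version)
echo "Execution Time awk: ${awk_time}s"
echo "Execution Time bc:  ${bc_time}s"
```

The absolute numbers depend on the machine; only the ratio between the two lines is meaningful.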

Licensed under: CC-BY-SA with attribution