Question

We've been struggling for some time to properly publish internal metrics into Amazon's CloudWatch service. We have a number of different types of internal metrics that we map into CloudWatch's MetricDatum class and publish.

Each of the MetricDatum instances has a double value and also a StatisticSet, which accepts a sampleCount, a sum, and minimum/maximum values. For counters such as an httpd 200 page counter, it is more appropriate to use the StatisticSet and set both the sampleCount and the sum to the value of the counter. If you look at the ELB stats, for example, that is how Amazon publishes them. This makes the sum, average, and other graph views work correctly when you graph the result.

The problem arises when the value of the counter is 0, because CloudWatch does not allow you to publish a StatisticSet with a sampleCount of 0. ELB handles this by not publishing anything for that period, which creates holes in the graph. This is a pain because you get INSUFFICIENT_DATA warnings whenever the counter is 0 for the time period. If you have notifications on ERROR and want to know when you transition back to OK, the INSUFFICIENT_DATA to OK alerts will keep you up all night.

You have 1 alarm in INSUFFICIENT DATA state in US East (N. Virginia) region.

Question: How do I properly publish CloudWatch metrics so that I don't see the INSUFFICIENT_DATA warnings but can still use the sampleCount with metrics that have a value of 0?


Solution

Although you cannot publish a StatisticSet with a sampleCount of 0, you can publish it with an extremely small sampleCount, since it is a double. We have found that a sampleCount of 0.000000001 gives the appearance of 0 on the graphs while still filling in the holes and not triggering the INSUFFICIENT_DATA alarms.

double sampleCount = numSamples;
// our values come in as value and numSamples but StatisticSet wants a sum
double sum = value * numSamples;
if (numSamples == 0) {
    // special case here, CloudWatch does not allow a 0 sample count so we have to
    // set it to be slightly more
    sampleCount = 0.000000001D;
    // but sum can be 0
}
StatisticSet statisticSet =
    new StatisticSet().withMinimum(min)
            .withMaximum(max)
            .withSampleCount(sampleCount)
            .withSum(sum);
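
If you want to unit test the zero-sample special case without the AWS SDK on the classpath, the calculation can be pulled into a small helper. This is only a sketch: StatisticValues and its of method are hypothetical names, not part of the SDK, and it assumes numSamples arrives as a long and value is the per-sample average, as in the snippet above.

```java
// Hypothetical helper (not part of the AWS SDK): computes the sampleCount and
// sum to hand to StatisticSet, substituting a tiny sampleCount for 0 samples.
final class StatisticValues {
    // CloudWatch rejects a sampleCount of 0, so publish something slightly larger.
    static final double MIN_SAMPLE_COUNT = 0.000000001D;

    final double sampleCount;
    final double sum;

    private StatisticValues(double sampleCount, double sum) {
        this.sampleCount = sampleCount;
        this.sum = sum;
    }

    // Assumption: value is the per-sample average and numSamples is the
    // number of samples seen in the period.
    static StatisticValues of(double value, long numSamples) {
        if (numSamples == 0) {
            // special case: tiny sampleCount, but the sum can stay 0
            return new StatisticValues(MIN_SAMPLE_COUNT, 0.0);
        }
        // StatisticSet wants a sum rather than an average
        return new StatisticValues(numSamples, value * numSamples);
    }
}
```

The resulting sampleCount and sum fields would then be passed to withSampleCount and withSum exactly as in the snippet above.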

As an aside, I've coded some of this logic into my SimpleMetrics library which is designed to easily track and publish metrics.

Licensed under: CC-BY-SA with attribution