Question

I have a few devices that emit time series data:

[deviceID],[time],[value]

I am using graphite to keep track of this data but the question applies to other databases as well.

I have defined my data retention/precision to be 5 seconds - so each device will only have one value per 5 seconds which is the average of all the observations it had made during this period. For example if these are the real measurements:

device1    1/1/2012 08:00:00    12
device1    1/1/2012 08:00:01    10
device2    1/1/2012 08:00:01    2
device1    1/1/2012 08:00:02    14

Then the data saved will be:

device1    1/1/2012 08:00:00    12
device2    1/1/2012 08:00:00    2

How could I query for the average value across both devices in this time period? I can't just average the saved values (= 7), because that result is biased downward: it ignores that device1 contributed more measurements (the mean of the four raw values is 9.5). Do I need to keep track of the average for every device pair/trio? Maybe it is best not to aggregate at all and keep maximum flexibility? Or is it acceptable to not support such cross-device queries if this is just a nice-to-have feature?


Solution

Have you considered calculating a weighted mean?

A simple example would be like this:

(number of d1 measurements * d1 average) + (number of d2 measurements * d2 average)
___________________________________________________________________________________
                      total number of d1 and d2 measurements

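This weighted mean takes into account the number of measurements from each device, so it is not biased downwards. Note that it only works if you store the measurement count for each device and interval alongside the average.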
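As a rough sketch of the weighted mean, assuming each saved data point also carries the number of raw measurements it was averaged from (the schema in the question does not store this; the `(count, mean)` pairs and the `weighted_mean` helper are hypothetical and not part of graphite), the calculation in Python could look like this:

```python
def weighted_mean(buckets):
    """Combine per-device (count, mean) pairs into one overall mean.

    `buckets` is an iterable of (count, mean) tuples, one per device,
    all belonging to the same time interval (hypothetical layout).
    """
    buckets = list(buckets)
    total_count = sum(count for count, _ in buckets)
    if total_count == 0:
        raise ValueError("no measurements in this interval")
    # Each device's mean is weighted by how many raw values produced it.
    return sum(count * mean for count, mean in buckets) / total_count


# Saved values for the 08:00:00 interval from the question, extended
# with the assumed measurement counts (3 for device1, 1 for device2).
saved = {"device1": (3, 12.0), "device2": (1, 2.0)}

overall = weighted_mean(saved.values())
print(overall)  # 9.5, the same as averaging the four raw values
```

With equal counts the result falls back to the plain average of 7, which is why the unweighted query only looks right when every device reports equally often.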

Licensed under: CC-BY-SA with attribution