Question

I've just come across RRD lately while trying out the Ganglia monitoring system. Ganglia stores its monitoring data in RRD. I am wondering how RRD works from a scalability perspective. What if I have a potentially huge amount of data to store? In the Ganglia case, if I want to keep all the historical monitoring statistics instead of only recent data with a specific TTL, will RRD be good enough to cope with that?

Can someone who has used RRD share some experience on how it scales, and how it compares to an RDBMS or even Bigtable?


Solution 2

RRD is designed to automatically blur (average out) your data over time, so that the total size of the database stays roughly constant even as new data continuously arrives.

So it is only a good fit if you want some historical data and are willing to lose precision over time.
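To make the "averaging out" concrete, here is a hypothetical sketch (not rrdtool's actual code) of the consolidation idea: older fine-grained samples are averaged into coarser buckets, so storage shrinks by the consolidation factor while per-sample detail is lost. The function name `consolidate` is an illustration, not part of any RRD API.

```python
# Sketch of RRD-style consolidation with an AVERAGE consolidation
# function: groups of `steps_per_bucket` consecutive samples are
# replaced by their mean, shrinking storage by that factor.

def consolidate(samples, steps_per_bucket):
    """Average consecutive groups of `steps_per_bucket` samples;
    an incomplete trailing group is dropped for simplicity."""
    n = len(samples) // steps_per_bucket
    return [
        sum(samples[i * steps_per_bucket:(i + 1) * steps_per_bucket])
        / steps_per_bucket
        for i in range(n)
    ]

# Twelve 5-minute samples become one 1-hour average:
# 12x less storage, but the individual spikes are gone.
hourly = consolidate([1.0] * 6 + [3.0] * 6, 12)
print(hourly)  # [2.0]
```

This is why a short spike that is obvious in the high-resolution archive can vanish entirely once it has been consolidated into hourly or daily averages.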

In other words, you cannot really compare RRD to standard SQL databases or to Bigtable, because both SQL and NoSQL databases store data precisely: you will read back exactly what was written.

With RRDtool, however, there is no such guarantee. But its speed and fixed storage footprint make it an attractive choice for all kinds of monitoring setups where only the most recent data matters at full precision.

OTHER TIPS

The built-in consolidation feature of rrdtool is configurable, so, disk space permitting, there is no limit to the amount of high-precision data you can store with rrdtool. Also, due to its design, rrdtool databases never need to be vacuumed or otherwise maintained, so you can grow a setup to staggering sizes. Obviously you need enough memory and fast disks for rrdtool to work with big data, but the same is true of any large data set.
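As an illustration of that configurability, here is a sketch of an `rrdtool create` invocation defining several round-robin archives (RRAs) at different resolutions; the data source name `load` and the retention periods are example choices, not requirements:

```shell
# One GAUGE data source sampled every 300 s, kept at three resolutions.
rrdtool create load.rrd --step 300 \
    DS:load:GAUGE:600:0:U \
    RRA:AVERAGE:0.5:1:2016 \
    RRA:AVERAGE:0.5:12:1440 \
    RRA:AVERAGE:0.5:288:3650
# RRA 1: raw 5-min samples, 2016 rows  = 1 week
# RRA 2: 1-hour averages (12 steps), 1440 rows = 60 days
# RRA 3: 1-day averages (288 steps), 3650 rows = ~10 years
```

Because every RRA has a fixed row count, the file size is determined at creation time; you trade disk space for retention and resolution up front, and the database never grows afterwards.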

Some people are confused about rrdtool's abilities because it can also run on a tiny embedded system; when they then start logging gigabytes worth of data on an old PC from the attic and find that it does not cope, they wonder why.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow