Question

I need to implement a time series graph for my Rails 4 app. It's not streaming data, but I need the look and feel of this, just without the animation. I started looking into the documentation and it seems kind of sparse (or I'm an idiot :()

But I need some help getting started with this project and can't seem to find much after googling. There is a gem that someone created here, but again it lacks any documentation.

The wiki talks about Graphite and Cube, libraries that serve up data, but these are Python libraries. Why would I use them?

Where should I start?

So far in my Rails app I have created a "Visualization" scaffold. Each visualization has a name and a description. I plan to have the "show" method render the graphic for a visualization.

The data I'm showing is:

 x axis = time, scaled minute by minute
 y axis = TV channels
 metric to be shown = viewership stats (integers, or decimals to 2 significant figures)

The data will be pulled from a Rails + MySQL database.

Where should I begin? Any help getting started would be much appreciated.

Thanks

Solution

Graphite is 'easy' and 'fixed' on storage.

Easy, because metrics can be stored at varying granularity. You can store, for example, a datapoint every 10s for 1 week, every 1min for 30 days, and every 10min for 1 year. This design costs you roughly a hundredth of what traditional storage designs would. It isn't really a compromise on detail, because you would practically never care about an event from 5 months back at 10s granularity. Over that much time you would focus on the trend rather than the actual value.

Fixed, because every file takes a constant, one-time amount of storage space. You will only run out of disk if you measure more things.
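To make the storage trade-off concrete, here is a rough Ruby sketch that counts how many datapoint slots the retention scheme above needs per metric. The names are made up for illustration, and the 12 bytes per datapoint (a timestamp plus a value) and the 365-day year are assumptions, so treat the byte figure as a ballpark only.

```ruby
# Sketch: count the datapoint slots needed for the retention scheme above.
# ASSUMPTIONS: 12 bytes per datapoint and a 365-day year; file headers ignored.

BYTES_PER_POINT = 12 # assumed size of one stored (timestamp, value) pair

# Each archive is [seconds between datapoints, seconds of history kept].
ARCHIVES = [
  [10,  7 * 86_400],   # every 10 s for 1 week
  [60,  30 * 86_400],  # every 1 min for 30 days
  [600, 365 * 86_400]  # every 10 min for 1 year
].freeze

# Slots one archive needs: length of history divided by the step.
def points_in(step, history)
  history / step
end

# Slots needed across all archives of one metric.
def total_points(archives)
  archives.sum { |step, history| points_in(step, history) }
end

# Approximate on-disk size of one metric under the assumed point size.
def approx_bytes(archives)
  total_points(archives) * BYTES_PER_POINT
end
```

Calling total_points(ARCHIVES) gives 156,240 slots per metric, which is a little under 2 MB with the assumed point size, and that number stays the same however long the metric is collected.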

It is really simple and cheap to send metrics to Graphite.

A single line of code like this: echo "yahoo.mysql.update.time 4 EPOCH" | nc 10.0.0.12 2003 (where EPOCH stands for the current Unix timestamp) is all it takes to send a metric to Graphite. This means that not every dev on the team needs a working knowledge of Graphite's internals. Moreover, modules, programs, servers and domains can each prefix their identity to the metric name, which makes the information really manageable at the other end. For example, this particular metric, with a minimal setup, would become beta.front-layer.ip-10-0-0-139.crawler.yahoo.mysql.update.time.
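Since the question is about a Rails app, the same one-liner could be sketched in Ruby roughly like this. The method names graphite_line and send_metric are invented for the example, and the host and port are simply the ones from the nc command above; this is a minimal sketch, not a full client.

```ruby
require "socket"

# Build one line in the same format as the echo command above:
# "<metric path> <value> <epoch seconds>\n"
# prefix is optional and is joined to the path with a dot.
def graphite_line(path, value, time = Time.now, prefix: nil)
  full_path = [prefix, path].compact.join(".")
  "#{full_path} #{value} #{time.to_i}\n"
end

# Open a TCP connection, write the line, and close the connection again.
# host and port are whatever your Graphite instance listens on (assumed here).
def send_metric(host, port, line)
  socket = TCPSocket.new(host, port)
  socket.write(line)
ensure
  socket&.close
end
```

Usage would look something like send_metric("10.0.0.12", 2003, graphite_line("yahoo.mysql.update.time", 4, prefix: "beta.front-layer.ip-10-0-0-139.crawler")).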

You can also send metrics over UDP, which makes it a fire-and-forget mechanism that puts almost no stress on the system. The act of profiling should not itself load the system being profiled. Heisenberg would smile here.
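A fire-and-forget send might look like the following in Ruby. This is only a sketch: send_metric_udp is a made-up name, and it assumes the receiving Graphite instance has been set up to accept UDP on the port you pass in.

```ruby
require "socket"

# Send one already formatted "<path> <value> <epoch>\n" line as a single
# UDP datagram. No connection is set up and no reply is awaited, so the
# caller is never blocked waiting on the metrics server.
# Returns the number of bytes handed to the network stack.
# ASSUMPTION: the receiver is configured to listen for UDP on this port.
def send_metric_udp(host, port, line)
  socket = UDPSocket.new
  socket.send(line, 0, host, port)
ensure
  socket&.close
end
```

The flip side of this design is that a dropped datagram is lost silently, which is usually acceptable for profiling data.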

Retrieval from Graphite is possibly one of the best implementations out there

The documentation for Graphite's URL API describes the various ways in which you can extract data as JSON, CSV, SVG, PNG, etc. There are also many multi-functional, plug-and-play front-ends.
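For a Rails "show" page, pulling a series back out as JSON could be sketched like this. The helper names render_url and parse_series are invented, the base URL is an assumption about where the Graphite web app is reachable, and the JSON shape assumed is a list of objects with a "target" and a list of [value, timestamp] pairs.

```ruby
require "json"
require "uri"

# Build a render URL that asks for JSON output.
# base_url is an assumption: wherever your Graphite web app is served.
def render_url(base_url, target, from: "-1h")
  uri = URI.join(base_url, "/render")
  uri.query = URI.encode_www_form(target: target, from: from, format: "json")
  uri
end

# Turn a JSON response body into { target => [[Time, value], ...] }.
# Slots with no recorded value arrive as null and are skipped here.
def parse_series(body)
  JSON.parse(body).each_with_object({}) do |series, result|
    result[series["target"]] = series["datapoints"]
      .reject { |value, _timestamp| value.nil? }
      .map { |value, timestamp| [Time.at(timestamp).utc, value] }
  end
end
```

You would fetch the URL with Net::HTTP or any HTTP client, pass the response body to parse_series, and hand the resulting points to whatever charting library draws the graph.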

Scalable

Graphite has a very small footprint. To put numbers on it, I am handling 450K metrics per minute on an EC2 m1.large machine with 1000 PIOPS, which is a lot. (This does seem to be the limit for that machine, though, and I plan to scale the architecture horizontally.)

Licensed under: CC-BY-SA with attribution