Question

I am trying to capture the disk I/O and network I/O of Hadoop tasks (mappers and reducers), namely instantaneous bandwidth, accumulated traffic, source address, and destination address. I found two popular monitoring tools for Hadoop: Ganglia (usually combined with Nagios) and X-Trace. Ganglia was introduced by UC Berkeley in 2004, and X-Trace was developed in 2007, also at UC Berkeley.

Any suggestions on the pros and cons of these two tools would be appreciated.

Solution

I'd get started with Ganglia or Munin. Either one will tell you about the resource utilization on the different machines in your cluster.
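To give a feel for the kind of per-machine numbers such tools report, here is a minimal sketch in Python that samples the Linux `/proc/net/dev` counters twice and derives instantaneous bandwidth and accumulated traffic per interface. This is only an illustration under the assumption of a Linux host; it is not a description of how either tool is implemented, and the function names (`parse_net_dev`, `bandwidth`, `sample`) are made up for this example.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    # The first two lines of /proc/net/dev are column headers.
    for line in text.splitlines()[2:]:
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # unexpected layout, skip rather than guess
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def bandwidth(before, after, seconds):
    """Return {interface: (rx_bytes_per_s, tx_bytes_per_s)} between two samples."""
    rates = {}
    for name, (rx1, tx1) in after.items():
        if name in before:
            rx0, tx0 = before[name]
            rates[name] = ((rx1 - rx0) / seconds, (tx1 - tx0) / seconds)
    return rates


def sample(path="/proc/net/dev"):
    """Read the current cumulative counters (Linux only)."""
    with open(path) as f:
        return parse_net_dev(f.read())


if __name__ == "__main__":
    interval = 1.0  # seconds between samples (arbitrary choice)
    first = sample()
    time.sleep(interval)
    second = sample()
    for iface, (rx, tx) in bandwidth(first, second, interval).items():
        total_rx, total_tx = second[iface]
        print(f"{iface}: rx {rx:.0f} B/s, tx {tx:.0f} B/s "
              f"(accumulated rx {total_rx} B, tx {total_tx} B)")
```

Note that counters like these are per host and per interface. They do not attribute traffic to an individual mapper or reducer, and they carry no source or destination addresses, so they cover only part of what the question asks for.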

X-Trace is a fairly academic project that generates data about distributed transactions, latency, bottlenecks, and flow of control in distributed systems. Unfortunately, it is not well supported at present.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow