Question

I am trying to capture the disk I/O and network I/O of Hadoop tasks (mappers and reducers): specifically, instantaneous bandwidth, accumulated traffic, and source and destination addresses. I found two popular monitoring tools for Hadoop: Ganglia (usually combined with Nagios) and X-Trace. Ganglia was introduced in 2004 by UC Berkeley, and X-Trace was developed in 2007, also at UC Berkeley.

Any suggestions on the pros and cons of these two tools would be appreciated.

Solution

I'd get started with Ganglia or Munin. Those will tell you about the resource utilization on the different machines in your cluster.
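To give a feel for the kind of per-machine number such tools report, here is a minimal sketch that derives network throughput from two samples of the Linux `/proc/net/dev` counters. It is an illustration only, not necessarily how Ganglia or Munin collect their data; the function names `parse_net_dev` and `bandwidth` are made up for this example. Note that it gives per-interface totals for the whole host, not per-task traffic or source/destination addresses.

```python
import time


def parse_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from /proc/net/dev content."""
    counters = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 9:
            continue  # unexpected layout, skip the line
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def bandwidth(before, after, interval_s):
    """Return {interface: (rx_bytes_per_s, tx_bytes_per_s)} between two samples."""
    rates = {}
    for iface, (rx1, tx1) in after.items():
        if iface not in before:
            continue  # interface appeared between samples
        rx0, tx0 = before[iface]
        rates[iface] = ((rx1 - rx0) / interval_s, (tx1 - tx0) / interval_s)
    return rates


def read_net_dev(path="/proc/net/dev"):
    """Read the current counters from the kernel (Linux only)."""
    with open(path) as f:
        return parse_net_dev(f.read())


if __name__ == "__main__":
    interval = 1.0  # sampling interval in seconds (arbitrary choice)
    first = read_net_dev()
    time.sleep(interval)
    second = read_net_dev()
    for iface, (rx, tx) in bandwidth(first, second, interval).items():
        print(f"{iface}: rx {rx:.0f} B/s, tx {tx:.0f} B/s")
```

The raw counters are the accumulated traffic; the difference between two samples divided by the interval approximates the instantaneous bandwidth.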

X-Trace is a fairly academic project that generates data about distributed transactions, latency and bottlenecks, and the flow of control in distributed systems. Unfortunately, it is not well supported at the moment.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow