I'd start with Ganglia or Munin. Both will show you resource utilization (CPU, memory, disk, network) on each machine in your cluster.
If you need to follow individual requests rather than per-machine metrics, X-Trace is a fairly academic project that generates data about distributed transactions, latency and bottlenecks, and flow of control in distributed systems. Unfortunately, it's not really well supported currently.
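To make the tracing idea concrete, here is a toy sketch of the general approach: every request carries a task ID, and each recorded event points back at the event that caused it. This is not X-Trace's actual API. `TraceContext`, `Collector`, and the host names are hypothetical, and everything runs in one process, whereas a real tracer would send these reports over the network.

```python
import time
import uuid
from dataclasses import dataclass, field


@dataclass
class TraceContext:
    """Hypothetical metadata passed along with each request."""
    task_id: str    # shared by every event in one distributed transaction
    parent_id: str  # the event that caused the next one ("" for the root)


@dataclass
class Collector:
    """Stands in for a central report store (assumption: in-memory only)."""
    events: list = field(default_factory=list)

    def record(self, ctx, host, label):
        event_id = uuid.uuid4().hex[:8]
        self.events.append({
            "task": ctx.task_id,
            "event": event_id,
            "parent": ctx.parent_id,
            "host": host,
            "label": label,
            "time": time.monotonic(),
        })
        # The returned context makes this event the parent of whatever follows.
        return TraceContext(ctx.task_id, event_id)


def handle_request(collector):
    # Start a new task, then thread the context through each hop.
    ctx = TraceContext(task_id=uuid.uuid4().hex[:8], parent_id="")
    ctx = collector.record(ctx, "frontend", "request received")
    ctx = collector.record(ctx, "backend", "query executed")
    collector.record(ctx, "frontend", "response sent")


collector = Collector()
handle_request(collector)
for e in collector.events:
    print(e["task"], e["parent"] or "-", "->", e["event"], e["host"], e["label"])
```

Following the parent links for one task ID gives you the flow of control, and the time differences between linked events give a rough idea of where the latency is.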