Question

As I understand it from blogs, YARN (MapReduce 2) is faster or smarter than Hadoop's classic MapReduce. If that's true, is there a way to configure Hive to use YARN/MapReduce 2 without complications, to improve performance or increase resource utilization?


Solution

Hive runs on YARN's MapReduce out of the box.

But running an old Hive on YARN is not going to be an earth-shattering experience; you'll probably measure the same times. What you want is to get the latest Hive developments and improvements (e.g. ORC and vectorization), and perhaps try running Hive on Tez. I recommend reading about Stinger and going over this deployment guide.
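As a rough sketch of what those improvements look like in a Hive session, the settings below switch the execution engine to Tez, enable vectorized execution, and copy a table into ORC format. This assumes a Hive release that ships these features (roughly 0.13 or later) and a cluster where Tez is already installed; the table names `page_views` and `page_views_orc` are hypothetical and only there for illustration.

```sql
-- Assumes Tez is installed and configured on the cluster;
-- the default engine is 'mr' (MapReduce on YARN).
SET hive.execution.engine=tez;

-- Process rows in batches instead of one at a time.
-- Vectorization only applies to supported formats such as ORC.
SET hive.vectorized.execution.enabled=true;

-- Hypothetical tables: rewrite an existing table into ORC
-- so that queries can benefit from the columnar format.
CREATE TABLE page_views_orc
STORED AS ORC
AS SELECT * FROM page_views;
```

These settings apply per session, so it is easy to run the same query with and without them and compare timings before changing anything in `hive-site.xml`.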

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow