Question

What's the difference between running a jar file with the commands "hadoop jar" and "yarn jar"?

I've used the "hadoop jar" command on my Mac successfully, but I want to be sure that the job is executing correctly and in parallel across my four cores.

Thanks!!!

Solution

Short Answer

They are probably identical for you, but even if they aren't, they should both utilize your cluster to the best of its ability.


Longer Answer

The /usr/bin/yarn script sets up the execution environment so that all of the YARN commands can be run. The /usr/bin/hadoop script isn't quite as concerned with YARN-specific functionality. However, if your cluster is set up to use YARN as the default implementation of MapReduce (MRv2), then hadoop jar will probably act the same as yarn jar for a MapReduce job.
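Whether MapReduce jobs go to YARN is controlled by the mapreduce.framework.name property in mapred-site.xml, which can be local, classic, or yarn. A minimal sketch for reading it follows. The helper name framework_name is made up for illustration, and it assumes the usual layout where the <value> line directly follows the <name> line:

```shell
# Hypothetical helper: print the value of mapreduce.framework.name
# from the mapred-site.xml file given as the first argument.
framework_name() {
  # Take the property's <name> line plus the line after it,
  # then extract whatever sits between <value> and </value>.
  grep -A1 '<name>mapreduce.framework.name</name>' "$1" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}
```

You would call it as framework_name "$HADOOP_CONF_DIR/mapred-site.xml", assuming HADOOP_CONF_DIR points at your configuration directory. If it prints nothing, the property is not set in that file and the built-in default (local) most likely applies, in which case the job runs inside a single JVM rather than on YARN.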

Either way you're probably fine, but you can always check the ResourceManager (or JobTracker) web interface to see how your job is distributed across the cluster, whether it's a single-node cluster or not.
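The same check can be done from the command line. This is only a sketch: it assumes the yarn CLI is on your PATH and a ResourceManager is running, and the function name show_job_distribution is made up for illustration:

```shell
# Hypothetical helper: show what the ResourceManager knows about nodes and jobs.
# Assumes the `yarn` CLI is on PATH and a ResourceManager is reachable.
show_job_distribution() {
  # NodeManagers known to the ResourceManager, with running-container counts.
  yarn node -list -all
  # Applications in every state, including ones that already finished.
  yarn application -list -appStates ALL
}
```

If your job never shows up in the application list, it probably did not run on YARN at all (for example, it ran with the local job runner). On a default setup the ResourceManager web interface is usually served on port 8088.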

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow