My 2 cents.
Single node setup (standalone setup)
By default, Hadoop is configured to run in a non-distributed or standalone mode, as a single Java process. There are no daemons running and everything runs in a single JVM instance. HDFS is not used.
You don't have to configure anything except JAVA_HOME. Just download the tarball, extract it, and you are good to go.
Pseudo-distributed mode
The Hadoop daemons run on the local machine, simulating a cluster on a small scale. Each daemon runs in its own JVM instance, but all of them are on a single machine. HDFS is used instead of the local file system.
For a pseudo-distributed setup, you need to set at least the following 2 properties along with JAVA_HOME: fs.default.name in core-site.xml and mapred.job.tracker in mapred-site.xml.
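As a rough sketch, the two properties could be set as shown here. The host and ports (localhost:9000 for the NameNode, localhost:9001 for the JobTracker) are assumptions chosen for illustration, so adjust them to your machine:

```xml
<!-- conf/core-site.xml: point the default file system at a local HDFS NameNode.
     The hdfs://localhost:9000 address is an assumed value. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml: tell MapReduce where the JobTracker runs.
     The localhost:9001 address is an assumed value. -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>
```

Each `<configuration>` element goes in its own file; they are shown together only to keep this short.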
You could have multiple datanodes and tasktrackers, but that doesn't make much sense on a single machine.
HTH