You can run the MapReduce "hello world" job (WordCount). Note that your paths might be slightly different:
- HADOOP_HOME is the directory where you have Hadoop installed.
- an example test input file exists at <HADOOP_HOME>/input/file01 (if you don't have one, you can create it as shown below)
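Any plain-text file will do; the sample lines below are only an illustration, not a file shipped with Hadoop:
mkdir -p <HADOOP_HOME>/input
echo "Hello World Bye World" > <HADOOP_HOME>/input/file01
echo "Hello Hadoop Goodbye Hadoop" >> <HADOOP_HOME>/input/file01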
- prepare the directory structure in your HDFS:
<HADOOP_HOME>/bin/hdfs dfs -mkdir /wordcount
<HADOOP_HOME>/bin/hdfs dfs -mkdir /wordcount/input
<HADOOP_HOME>/bin/hdfs dfs -mkdir /wordcount/output
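On Hadoop 2.x you can also create the whole tree in one call instead:
<HADOOP_HOME>/bin/hdfs dfs -mkdir -p /wordcount/input
(the job creates its final output directory, /wordcount/output/file01-output, on its own and will fail if that directory already exists)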
- put file01 into HDFS:
<HADOOP_HOME>/bin/hdfs dfs -put <HADOOP_HOME>/input/file01 /wordcount/input
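You can verify the upload before running the job:
<HADOOP_HOME>/bin/hdfs dfs -ls /wordcount/input
<HADOOP_HOME>/bin/hdfs dfs -cat /wordcount/input/file01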
- go to the directory with the examples jar:
cd <HADOOP_HOME>/share/hadoop/mapreduce/lib-examples
(in my case, the jar is named hadoop-mapreduce-examples-2.3.0.jar)
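Running the jar without arguments prints the list of bundled example programs (wordcount among them), which is a quick way to confirm you have the right jar:
<HADOOP_HOME>/bin/hadoop jar ./hadoop-mapreduce-examples-2.3.0.jar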
- fire away the MapReduce job:
<HADOOP_HOME>/bin/hadoop jar ./hadoop-mapreduce-examples-2.3.0.jar wordcount /wordcount/input/file01 /wordcount/output/file01-output
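If you rerun the job, remove the previous output first, since MapReduce refuses to overwrite an existing output directory:
<HADOOP_HOME>/bin/hdfs dfs -rm -r /wordcount/output/file01-output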
The job should finish successfully, and you should see the words from file01 counted and stored in the /wordcount/output/file01-output directory:
<HADOOP_HOME>/bin/hdfs dfs -cat /wordcount/output/file01-output/part-r-00000
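Assuming the illustrative input from above, the part file would contain tab-separated word counts along the lines of:
Bye	1
Goodbye	1
Hadoop	2
Hello	2
World	2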