Question

I get a strange exception when trying to run a MapReduce job on a Hadoop cluster. What is interesting here is that I can access HDFS, but I am unable to run a job.

UriBuilder uriBuilder = new UriBuilder("192.168.16.132");
uriBuilder.Port = 8021; // 8082;
var hadoop = Hadoop.Connect(uriBuilder.Uri, "username", "password");
hadoop.StorageSystem.MakeDirectory("user/username/test"); // This works
//establish job configuration
HadoopJobConfiguration myConfig = new HadoopJobConfiguration();
myConfig.InputPath = "/user/username/input";
myConfig.OutputFolder = "/user/username/output";
try
{
    //connect to cluster
    MapReduceResult jobResult = hadoop.MapReduceJob.Execute<MySimpleMapper, MySimpleReducer>(myConfig); //This does not work and produces an error: The remote name could not be resolved
    //write job result to console
    int exitCode = jobResult.Info.ExitCode;
    string exitStatus = exitCode == 0 ? "Success" : "Failure";
    exitStatus = exitCode + " (" + exitStatus + ")";
    Console.WriteLine();
    Console.Write("Exit Code = " + exitStatus);
    Console.Read();
}
catch (Exception exc)
{
    // Observed failure: "Error sending request."
    Console.WriteLine(exc.Message);
}

I am using the Hortonworks sandbox for testing, in case it makes any difference. The exact error is: "The remote name could not be resolved: 'sandbox'".
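
One way to verify that this is a client-side name-resolution problem is to try resolving the name directly. A minimal sketch, assuming the error means the client tried to contact a host named 'sandbox':

using System;
using System.Net;
using System.Net.Sockets;

class DnsCheck
{
    static void Main()
    {
        try
        {
            // Throws SocketException if the name cannot be resolved,
            // which would match "The remote name could not be resolved".
            IPHostEntry entry = Dns.GetHostEntry("sandbox");
            foreach (IPAddress address in entry.AddressList)
                Console.WriteLine("sandbox resolves to " + address);
        }
        catch (SocketException)
        {
            Console.WriteLine("'sandbox' cannot be resolved from this machine.");
        }
    }
}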

Could anyone explain why this is happening and what I could do to fix it?

EDIT: I have fixed the issue by adding the IP of the Hadoop cluster to the hosts file; however, now I am getting the following exception: "Response status code does not indicate success: 500 (Server Error)."
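
For reference, the fix amounts to adding one line to the client's hosts file (on Windows, C:\Windows\System32\drivers\etc\hosts), mapping the host name from the error to the cluster IP used in the question:

192.168.16.132    sandbox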


Solution

It turns out that the cluster was not a Windows Azure (HDInsight) deployment but a plain Apache Hadoop implementation, and the job-submission protocols are incompatible. The HDFS protocol is the same for both implementations, which is why the storage calls worked; however, the Map/Reduce submission framework that the SDK relies on is not supported by this cluster.
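
If jobs still have to be submitted to a plain Apache/Hortonworks cluster from .NET, one option (a sketch, not from the original answer) is to bypass the SDK's MapReduceJob and call WebHCat (Templeton), the cluster's REST job-submission service, directly. This assumes WebHCat is running on its default port 50111 and that the job jar has already been uploaded to HDFS; the jar path, driver class, and status directory below are placeholders:

using System;
using System.Collections.Generic;
using System.Net.Http;

class WebHCatJobSubmit
{
    static void Main()
    {
        using (var client = new HttpClient())
        {
            // WebHCat (Templeton) MapReduce-jar endpoint; 'username' matches the question.
            string url = "http://sandbox:50111/templeton/v1/mapreduce/jar?user.name=username";

            // Form parameters for the job; repeated "arg" entries become the
            // job's command-line arguments (input and output paths here).
            var form = new FormUrlEncodedContent(new[]
            {
                new KeyValuePair<string, string>("jar", "/user/username/myjob.jar"),
                new KeyValuePair<string, string>("class", "MyJobDriver"),
                new KeyValuePair<string, string>("arg", "/user/username/input"),
                new KeyValuePair<string, string>("arg", "/user/username/output"),
                new KeyValuePair<string, string>("statusdir", "/user/username/status")
            });

            HttpResponseMessage response = client.PostAsync(url, form).Result;
            string body = response.Content.ReadAsStringAsync().Result;

            // On success WebHCat returns JSON containing the job id.
            Console.WriteLine((int)response.StatusCode + ": " + body);
        }
    }
}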
