Question

I have submitted a MapReduce job on CDH5 Beta 2 using the hadoop jar command, as follows:

hadoop jar ./hadoop-examples-0.0.1-SNAPSHOT.jar com.aravind.learning.hadoop.mapred.join.ReduceSideJoinDriver tech_talks/users.csv tech_talks/ratings.csv tech_talks/output/ReduceSideJoinDriver/

I've also tried providing the filesystem name and job tracker URL explicitly, as below, without any success:

hadoop jar ./hadoop-examples-0.0.1-SNAPSHOT.jar com.aravind.learning.hadoop.mapred.join.ReduceSideJoinDriver -Dfs.default.name=hdfs://abc.com:8020 -Dmapreduce.job.tracker=x.x.x.x:8021 tech_talks/users.csv tech_talks/ratings.csv tech_talks/output/ReduceSideJoinDriver/

The job runs successfully, but it uses the LocalJobRunner instead of being submitted to the cluster; the output is written to HDFS and is correct. Explicitly specifying the filesystem and job tracker, as above, gives the same result. I'm not sure what I'm doing wrong here, so I'd appreciate your input. Here is the job output:

14/04/16 20:35:44 INFO Configuration.deprecation: session.id is deprecated. Instead, use dfs.metrics.session-id
14/04/16 20:35:44 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
14/04/16 20:35:45 WARN mapreduce.JobSubmitter: No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
14/04/16 20:35:45 INFO input.FileInputFormat: Total input paths to process : 2
14/04/16 20:35:45 INFO mapreduce.JobSubmitter: number of splits:2
14/04/16 20:35:46 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_local1427968352_0001
14/04/16 20:35:46 WARN conf.Configuration: file:/tmp/hadoop-ird2/mapred/staging/ird21427968352/.staging/job_local1427968352_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/04/16 20:35:46 WARN conf.Configuration: file:/tmp/hadoop-ird2/mapred/staging/ird21427968352/.staging/job_local1427968352_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/04/16 20:35:46 WARN conf.Configuration: file:/tmp/hadoop-ird2/mapred/local/localRunner/ird2/job_local1427968352_0001/job_local1427968352_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/04/16 20:35:46 WARN conf.Configuration: file:/tmp/hadoop-ird2/mapred/local/localRunner/ird2/job_local1427968352_0001/job_local1427968352_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
14/04/16 20:35:46 INFO mapreduce.Job: The url to track the job: http://localhost:8080/
14/04/16 20:35:46 INFO mapreduce.Job: Running job: job_local1427968352_0001
14/04/16 20:35:46 INFO mapred.LocalJobRunner: OutputCommitter set in config null
14/04/16 20:35:46 INFO mapred.LocalJobRunner: OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
14/04/16 20:35:46 INFO mapred.LocalJobRunner: Waiting for map tasks
14/04/16 20:35:46 INFO mapred.LocalJobRunner: Starting task: attempt_local1427968352_0001_m_000000_0
14/04/16 20:35:46 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/04/16 20:35:46 INFO mapred.MapTask: Processing split: hdfs://...:8020/user/ird2/tech_talks/ratings.csv:0+4388258
14/04/16 20:35:46 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/04/16 20:35:46 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/04/16 20:35:46 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/04/16 20:35:46 INFO mapred.MapTask: soft limit at 83886080
14/04/16 20:35:46 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/04/16 20:35:46 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/04/16 20:35:47 INFO mapreduce.Job: Job job_local1427968352_0001 running in uber mode : false
14/04/16 20:35:47 INFO mapreduce.Job:  map 0% reduce 0%
14/04/16 20:35:48 INFO mapred.LocalJobRunner:
14/04/16 20:35:48 INFO mapred.MapTask: Starting flush of map output
14/04/16 20:35:48 INFO mapred.MapTask: Spilling map output
14/04/16 20:35:48 INFO mapred.MapTask: bufstart = 0; bufend = 6485388; bufvoid = 104857600
14/04/16 20:35:48 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 24860980(99443920); length = 1353417/6553600
14/04/16 20:35:49 INFO mapred.MapTask: Finished spill 0
14/04/16 20:35:49 INFO mapred.Task: Task:attempt_local1427968352_0001_m_000000_0 is done. And is in the process of committing
14/04/16 20:35:49 INFO mapred.LocalJobRunner: map
14/04/16 20:35:49 INFO mapred.Task: Task 'attempt_local1427968352_0001_m_000000_0' done.
14/04/16 20:35:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1427968352_0001_m_000000_0
14/04/16 20:35:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1427968352_0001_m_000001_0
14/04/16 20:35:49 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/04/16 20:35:49 INFO mapred.MapTask: Processing split: hdfs://...:8020/user/ird2/tech_talks/users.csv:0+186304
14/04/16 20:35:49 INFO mapred.MapTask: Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
14/04/16 20:35:49 INFO mapred.MapTask: (EQUATOR) 0 kvi 26214396(104857584)
14/04/16 20:35:49 INFO mapred.MapTask: mapreduce.task.io.sort.mb: 100
14/04/16 20:35:49 INFO mapred.MapTask: soft limit at 83886080
14/04/16 20:35:49 INFO mapred.MapTask: bufstart = 0; bufvoid = 104857600
14/04/16 20:35:49 INFO mapred.MapTask: kvstart = 26214396; length = 6553600
14/04/16 20:35:49 INFO mapred.LocalJobRunner:
14/04/16 20:35:49 INFO mapred.MapTask: Starting flush of map output
14/04/16 20:35:49 INFO mapred.MapTask: Spilling map output
14/04/16 20:35:49 INFO mapred.MapTask: bufstart = 0; bufend = 209667; bufvoid = 104857600
14/04/16 20:35:49 INFO mapred.MapTask: kvstart = 26214396(104857584); kvend = 26192144(104768576); length = 22253/6553600
14/04/16 20:35:49 INFO mapred.MapTask: Finished spill 0
14/04/16 20:35:49 INFO mapred.Task: Task:attempt_local1427968352_0001_m_000001_0 is done. And is in the process of committing
14/04/16 20:35:49 INFO mapred.LocalJobRunner: map
14/04/16 20:35:49 INFO mapred.Task: Task 'attempt_local1427968352_0001_m_000001_0' done.
14/04/16 20:35:49 INFO mapred.LocalJobRunner: Finishing task: attempt_local1427968352_0001_m_000001_0
14/04/16 20:35:49 INFO mapred.LocalJobRunner: map task executor complete.
14/04/16 20:35:49 INFO mapred.LocalJobRunner: Waiting for reduce tasks
14/04/16 20:35:49 INFO mapred.LocalJobRunner: Starting task: attempt_local1427968352_0001_r_000000_0
14/04/16 20:35:49 INFO mapred.Task:  Using ResourceCalculatorProcessTree : [ ]
14/04/16 20:35:49 INFO mapred.ReduceTask: Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@5116331d
14/04/16 20:35:49 INFO reduce.MergeManagerImpl: MergerManager: memoryLimit=652528832, maxSingleShuffleLimit=163132208, mergeThreshold=430669056, ioSortFactor=10, memToMemMergeOutputsThreshold=10
14/04/16 20:35:49 INFO reduce.EventFetcher: attempt_local1427968352_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
14/04/16 20:35:49 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1427968352_0001_m_000001_0 decomp: 220797 len: 220801 to MEMORY
14/04/16 20:35:49 INFO reduce.InMemoryMapOutput: Read 220797 bytes from map-output for attempt_local1427968352_0001_m_000001_0
14/04/16 20:35:49 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 220797, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->220797
14/04/16 20:35:49 INFO reduce.LocalFetcher: localfetcher#1 about to shuffle output of map attempt_local1427968352_0001_m_000000_0 decomp: 7162100 len: 7162104 to MEMORY
14/04/16 20:35:49 INFO reduce.InMemoryMapOutput: Read 7162100 bytes from map-output for attempt_local1427968352_0001_m_000000_0
14/04/16 20:35:49 INFO reduce.MergeManagerImpl: closeInMemoryFile -> map-output of size: 7162100, inMemoryMapOutputs.size() -> 2, commitMemory -> 220797, usedMemory ->7382897
14/04/16 20:35:49 INFO reduce.EventFetcher: EventFetcher is interrupted.. Returning
14/04/16 20:35:49 INFO mapred.LocalJobRunner: 2 / 2 copied.
14/04/16 20:35:49 INFO reduce.MergeManagerImpl: finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
14/04/16 20:35:49 INFO mapred.Merger: Merging 2 sorted segments
14/04/16 20:35:49 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 7382885 bytes
14/04/16 20:35:50 INFO reduce.MergeManagerImpl: Merged 2 segments, 7382897 bytes to disk to satisfy reduce memory limit
14/04/16 20:35:50 INFO reduce.MergeManagerImpl: Merging 1 files, 7382899 bytes from disk
14/04/16 20:35:50 INFO reduce.MergeManagerImpl: Merging 0 segments, 0 bytes from memory into reduce
14/04/16 20:35:50 INFO mapred.Merger: Merging 1 sorted segments
14/04/16 20:35:50 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 7382889 bytes
14/04/16 20:35:50 INFO mapred.LocalJobRunner: 2 / 2 copied.
14/04/16 20:35:50 INFO Configuration.deprecation: mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
14/04/16 20:35:50 INFO mapreduce.Job:  map 100% reduce 0%
14/04/16 20:35:51 INFO mapred.Task: Task:attempt_local1427968352_0001_r_000000_0 is done. And is in the process of committing
14/04/16 20:35:51 INFO mapred.LocalJobRunner: 2 / 2 copied.
14/04/16 20:35:51 INFO mapred.Task: Task attempt_local1427968352_0001_r_000000_0 is allowed to commit now
14/04/16 20:35:51 INFO output.FileOutputCommitter: Saved output of task 'attempt_local1427968352_0001_r_000000_0' to hdfs://...:8020/user/ird2/tech_talks/output/ReduceSideJoinDriver/_temporary/0/task_local1427968352_0001_r_000000
14/04/16 20:35:51 INFO mapred.LocalJobRunner: reduce > reduce
14/04/16 20:35:51 INFO mapred.Task: Task 'attempt_local1427968352_0001_r_000000_0' done.
14/04/16 20:35:51 INFO mapred.LocalJobRunner: Finishing task: attempt_local1427968352_0001_r_000000_0
14/04/16 20:35:51 INFO mapred.LocalJobRunner: reduce task executor complete.
14/04/16 20:35:52 INFO mapreduce.Job:  map 100% reduce 100%
14/04/16 20:35:52 INFO mapreduce.Job: Job job_local1427968352_0001 completed successfully
14/04/16 20:35:52 INFO mapreduce.Job: Counters: 38
        File System Counters
                FILE: Number of bytes read=14767932
                FILE: Number of bytes written=29952985
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=13537382
                HDFS: Number of bytes written=2949787
                HDFS: Number of read operations=28
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=5
        Map-Reduce Framework
                Map input records=343919
                Map output records=343919
                Map output bytes=6695055
                Map output materialized bytes=7382905
                Input split bytes=272
                Combine input records=0
                Combine output records=0
                Reduce input groups=5564
                Reduce shuffle bytes=7382905
                Reduce input records=343919
                Reduce output records=5564
                Spilled Records=687838
                Shuffled Maps =2
                Failed Shuffles=0
                Merged Map outputs=2
                GC time elapsed (ms)=92
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
                Total committed heap usage (bytes)=1416101888
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=4574562
        File Output Format Counters
                Bytes Written=2949787

Driver code

public class ReduceSideJoinDriver extends Configured implements Tool
{
    @Override
    public int run(String[] args) throws Exception
    {
        if (args.length != 3)
        {
            System.err.printf("Usage: %s [generic options] <input> <output>\n", getClass().getSimpleName());
            ToolRunner.printGenericCommandUsage(System.err);
            return -1;
        }

        Path usersFile = new Path(args[0]);
        Path ratingsFile = new Path(args[1]);

        Job job = Job.getInstance(getConf(), "Aravind - Reduce Side Join");

        job.getConfiguration().setStrings(usersFile.getName(), "user");
        job.getConfiguration().setStrings(ratingsFile.getName(), "rating");

        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);
        job.setMapOutputKeyClass(IntWritable.class);
        job.setMapOutputValueClass(TagAndRecord.class);

        TextInputFormat.addInputPath(job, usersFile);
        TextInputFormat.addInputPath(job, ratingsFile);

        TextOutputFormat.setOutputPath(job, new Path(args[2]));

        job.setMapperClass(ReduceSideJoinMapper.class);
        job.setReducerClass(ReduceSideJoinReducer.class);
        job.setOutputKeyClass(IntWritable.class);
        job.setOutputValueClass(Text.class);

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception
    {
        int exitCode = ToolRunner.run(new Configuration(), new ReduceSideJoinDriver(), args);
        System.exit(exitCode);
    }
}

Solution 2

Apparently, you can only submit a Hadoop job from a node designated as a gateway node. Everything worked once I submitted the job from the gateway node.

OTHER TIPS

Make sure the following configuration files are present and valid on the Hadoop classpath. By default, configuration files are taken from the directory /etc/hadoop/conf. This should be done as part of setting up a Hadoop client node. (See the sketch after the list below for the properties that matter here.)

mapred-site.xml
yarn-site.xml
core-site.xml
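As a rough sketch of what these should contain (the hostnames and ports below are placeholders, not values from this question), the settings that decide whether a job is submitted to the cluster instead of run with the LocalJobRunner are fs.defaultFS, mapreduce.framework.name and the ResourceManager address:

core-site.xml

<configuration>
  <property>
    <!-- NameNode URI; replace the placeholder host with your own -->
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>

mapred-site.xml

<configuration>
  <property>
    <!-- without this set to yarn, MR2 defaults to local and uses the LocalJobRunner -->
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

yarn-site.xml

<configuration>
  <property>
    <!-- ResourceManager host; replace the placeholder with your own -->
    <name>yarn.resourcemanager.hostname</name>
    <value>resourcemanager.example.com</value>
  </property>
</configuration>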

If the above-mentioned configuration files are empty, you have to populate them with the right properties. This can be done in two ways:

In Cloudera Manager, when you click on the YARN service, the Actions menu has a Deploy Client Configuration option alongside Start, Stop, etc. Use that option to deploy the client configuration.

Sometimes the above option may not work, for example if the node is not managed by Cloudera Manager or no YARN gateway is configured on the node. In that case, use the Download Client Configuration option instead of Deploy Client Configuration. Extract the downloaded zip of configuration files (the files listed above) and copy them to /etc/hadoop/conf manually, for example as sketched below.
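A minimal sketch of that manual copy, assuming the downloaded archive is named yarn-clientconfig.zip (the actual name may differ) and that you have permission to write to /etc/hadoop/conf:

unzip yarn-clientconfig.zip -d client-conf
# copy the extracted *-site.xml files into place, wherever they sit inside the zip
sudo find client-conf -name '*-site.xml' -exec cp {} /etc/hadoop/conf/ \;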

Either the hadoop or the yarn command can be used to run the jar.
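For example, using the same jar and driver class as in the question, either of the following submits the job once the client configuration above points at the cluster:

yarn jar ./hadoop-examples-0.0.1-SNAPSHOT.jar com.aravind.learning.hadoop.mapred.join.ReduceSideJoinDriver tech_talks/users.csv tech_talks/ratings.csv tech_talks/output/ReduceSideJoinDriver/

hadoop jar ./hadoop-examples-0.0.1-SNAPSHOT.jar com.aravind.learning.hadoop.mapred.join.ReduceSideJoinDriver tech_talks/users.csv tech_talks/ratings.csv tech_talks/output/ReduceSideJoinDriver/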
