The MapReduce program is not giving me any output. Could somebody have a look into it?

StackOverflow https://stackoverflow.com/questions/22816754


Question

I am not getting any output from this program. When I run this MapReduce job, no result is produced.

Input file: dict1.txt

apple,seo
apple,sev
dog,kukura
dog,kutta
cat,bilei
cat,billi

Output I want:

apple seo|sev
dog kukura|kutta
cat bilei|billi

Mapper class code:

package com.accure.Dict;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;


public class DictMapper extends MapReduceBase implements Mapper<Text, Text, Text, Text> {

    private Text word = new Text();

    public void map(Text key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        StringTokenizer itr = new StringTokenizer(value.toString(), ",");
        while (itr.hasMoreTokens()) {
            System.out.println(key);
            word.set(itr.nextToken());
            output.collect(key, word);
        }
    }
}

Reducer code:

package com.accure.Dict;

import java.io.IOException;
import java.util.Iterator;


import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class DictReducer extends MapReduceBase implements Reducer<Text, Text, Text, Text> {

    private Text result = new Text();

    public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
        String translations = "";

        while (values.hasNext()) {
            translations += "|" + values.next().toString();
        }

        result.set(translations);
        output.collect(key, result);
    }
}

Driver code:

package com.accure.driver;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.KeyValueTextInputFormat;
import org.apache.hadoop.mapred.TextOutputFormat;

import com.accure.Dict.DictMapper;
import com.accure.Dict.DictReducer;
public class DictDriver {

    public static void main(String[] args) throws Exception{
        // TODO Auto-generated method stub


        JobConf conf=new JobConf();
        conf.setJobName("wordcount_pradosh");
        System.setProperty("HADOOP_USER_NAME","accure");

        conf.set("fs.default.name","hdfs://host2.hadoop.career.com:54310/");
        conf.set("hadoop.job.ugi","accuregrp");
        conf.set("mapred.job.tracker","host2.hadoop.career.com:54311");

        /*mapper and reduce class */
        conf.setMapperClass(DictMapper.class);
        conf.setReducerClass(DictReducer.class);

        /*This particular jar file has your classes*/
        conf.setJarByClass(DictMapper.class);

        Path inputPath= new Path("/myCareer/pradosh/input");
        Path outputPath=new Path("/myCareer/pradosh/output"+System.currentTimeMillis());


        /*input and output directory path */
        FileInputFormat.setInputPaths(conf,inputPath);
        FileOutputFormat.setOutputPath(conf,outputPath);

        conf.setMapOutputKeyClass(Text.class);
        conf.setMapOutputValueClass(Text.class);

        /*output key and value class*/
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);
        /*input and output format */

        conf.setInputFormat(KeyValueTextInputFormat.class); /*Here the file is a text file*/
        conf.setOutputFormat(TextOutputFormat.class);

        JobClient.runJob(conf);


    }

}

Output log:

14/04/02 08:33:38 INFO mapred.JobClient: Running job: job_201404010637_0011
14/04/02 08:33:39 INFO mapred.JobClient:  map 0% reduce 0%
14/04/02 08:33:58 INFO mapred.JobClient:  map 50% reduce 0%
14/04/02 08:33:59 INFO mapred.JobClient:  map 100% reduce 0%
14/04/02 08:34:21 INFO mapred.JobClient:  map 100% reduce 16%
14/04/02 08:34:23 INFO mapred.JobClient:  map 100% reduce 100%
14/04/02 08:34:25 INFO mapred.JobClient: Job complete: job_201404010637_0011
14/04/02 08:34:25 INFO mapred.JobClient: Counters: 29
14/04/02 08:34:25 INFO mapred.JobClient:   Job Counters 
14/04/02 08:34:25 INFO mapred.JobClient:     Launched reduce tasks=1
14/04/02 08:34:25 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33692
14/04/02 08:34:25 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
14/04/02 08:34:25 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
14/04/02 08:34:25 INFO mapred.JobClient:     Launched map tasks=2
14/04/02 08:34:25 INFO mapred.JobClient:     Data-local map tasks=2
14/04/02 08:34:25 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=25327
14/04/02 08:34:25 INFO mapred.JobClient:   File Input Format Counters 
14/04/02 08:34:25 INFO mapred.JobClient:     Bytes Read=92
14/04/02 08:34:25 INFO mapred.JobClient:   File Output Format Counters 
14/04/02 08:34:25 INFO mapred.JobClient:     Bytes Written=0
14/04/02 08:34:25 INFO mapred.JobClient:   FileSystemCounters
14/04/02 08:34:25 INFO mapred.JobClient:     FILE_BYTES_READ=6
14/04/02 08:34:25 INFO mapred.JobClient:     HDFS_BYTES_READ=336
14/04/02 08:34:25 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=169311
14/04/02 08:34:25 INFO mapred.JobClient:   Map-Reduce Framework
14/04/02 08:34:25 INFO mapred.JobClient:     Map output materialized bytes=12
14/04/02 08:34:25 INFO mapred.JobClient:     Map input records=6
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce shuffle bytes=12
14/04/02 08:34:25 INFO mapred.JobClient:     Spilled Records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Map output bytes=0
14/04/02 08:34:25 INFO mapred.JobClient:     Total committed heap usage (bytes)=246685696
14/04/02 08:34:25 INFO mapred.JobClient:     CPU time spent (ms)=2650
14/04/02 08:34:25 INFO mapred.JobClient:     Map input bytes=61
14/04/02 08:34:25 INFO mapred.JobClient:     SPLIT_RAW_BYTES=244
14/04/02 08:34:25 INFO mapred.JobClient:     Combine input records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce input records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce input groups=0
14/04/02 08:34:25 INFO mapred.JobClient:     Combine output records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Physical memory (bytes) snapshot=392347648
14/04/02 08:34:25 INFO mapred.JobClient:     Reduce output records=0
14/04/02 08:34:25 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=2173820928
14/04/02 08:34:25 INFO mapred.JobClient:     Map output records=0
Was it helpful?

Solution

When reading the input you set the input format to KeyValueTextInputFormat.

By default this format expects a tab byte as the separator between key and value. In your input the key and value are separated by ",", so the whole line becomes the key and the value ends up empty.

This is why execution never enters the loop below in your mapper:

while (itr.hasMoreTokens()) {
    System.out.println(key);
    word.set(itr.nextToken());
    output.collect(key, word);
}
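
One fix that keeps the mapper untouched is to tell KeyValueTextInputFormat to split on the comma instead of the default tab. For the old org.apache.hadoop.mapred API used here, that is controlled by the key.value.separator.in.input.line property (a minimal sketch; check the property name against your Hadoop version, since the newer mapreduce API uses a different one):

// In the driver, before submitting the job: make KeyValueTextInputFormat
// split each line on ',' so that "apple,seo" arrives in the mapper as
// key = "apple", value = "seo".
conf.set("key.value.separator.in.input.line", ",");
conf.setInputFormat(KeyValueTextInputFormat.class);

With that in place the existing StringTokenizer loop sees a non-empty value and emits one record per input line.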

Alternatively, you can tokenize the key yourself: take the first split as the key and the second split as the value, as in the sketch below.
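
A minimal sketch of that map method (my own rewrite, not code from the original post):

public void map(Text key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    // With the default tab separator the whole line ("apple,seo") arrives
    // as the key and the value is empty, so split the key ourselves.
    String[] parts = key.toString().split(",", 2);
    if (parts.length == 2) {
        output.collect(new Text(parts[0]), new Text(parts[1]));
    }
}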

This is evidenced in the logs: Map input records=6 but Map output records=0.
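
A side note beyond this fix: as written, the reducer prepends "|" before the first value, so the output would be "apple |seo|sev" rather than the desired "apple seo|sev". A quick sketch of a reduce method that only puts "|" between values (again my own rewrite):

public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
    StringBuilder translations = new StringBuilder();
    while (values.hasNext()) {
        if (translations.length() > 0) {
            translations.append("|"); // separator only between values
        }
        translations.append(values.next().toString());
    }
    result.set(translations.toString());
    output.collect(key, result);
}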

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow