Question

In my mapper class I read the file line by line and perform some string operations on each line (e.g. removing special characters from the string). After that, in the reducer, each line is stored in a new file.

Basically my reducer only creates a file and writes each line to that new file, nothing else. So is it possible for my mapper to create the files itself rather than passing data to the reducer? In other words, I want to drop the reducer and perform its work in the mapper. I don't know much about MapReduce, so any help would be appreciated.

Was it helpful?

Solution

Yes, what you want is very much possible. What you need is a map-only job, i.e. a job that has only a mapper and no reducer.

You can achieve this by setting the number of reducers to 0 in your driver class:

job.setNumReduceTasks(0);
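For context, here is a minimal driver sketch showing where that call fits. The class names `CleanLinesDriver` and `CleanLineMapper` are hypothetical; it assumes the Hadoop client libraries are on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver for a map-only line-cleaning job.
public class CleanLinesDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "clean lines (map-only)");
        job.setJarByClass(CleanLinesDriver.class);
        job.setMapperClass(CleanLineMapper.class); // assumed mapper class

        // Zero reducers: the mapper's output is written directly to the
        // output directory, with no shuffle or sort phase.
        job.setNumReduceTasks(0);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With zero reduce tasks, anything the mapper emits via `context.write` goes straight to `part-m-*` files in the output directory, which is usually all a pass-through reducer was doing anyway.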

A more detailed blog on this topic can be found here

To generate a file per input line, consider using the HDFS API directly from your mapper, namely FileSystem and FileStatus.
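A rough sketch of that idea, assuming a `TextInputFormat` input (so the key is the line's byte offset) and a hypothetical output directory `/output/lines`; the regex used for cleaning is also an assumption:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch: cleans each line and writes it to its own HDFS file instead
// of emitting it to the framework's output.
public class FilePerLineMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Example cleaning step: strip special characters (assumed rule).
        String cleaned = value.toString().replaceAll("[^a-zA-Z0-9 ]", "");

        FileSystem fs = FileSystem.get(context.getConfiguration());
        // Use the line's byte offset to build a unique file name.
        Path out = new Path("/output/lines/line-" + key.get() + ".txt");
        try (BufferedWriter writer = new BufferedWriter(
                new OutputStreamWriter(fs.create(out, true)))) {
            writer.write(cleaned);
        }
    }
}
```

Note that creating one HDFS file per line can produce a huge number of small files, which puts pressure on the NameNode; it is worth checking whether the default `part-m-*` output of a map-only job is sufficient before going this route.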

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow