Problem

In my mapper class I read the file line by line and perform some string operations on each line (e.g. removing special characters from the string). After that, in the reducer, each line is stored in a new file.

Basically, my reducer only creates the file and writes each line to a new file, nothing else. So is it possible for my mapper to create the files instead of passing the data to a reducer? In other words, I want to drop the reducer and perform its work in the mapper. I don't know much about MapReduce, so any kind of help would be appreciated.


Solution

Yes, what you want is very much possible. What you need is a map-only job, i.e. a job that has only a mapper and no reducer.

You can achieve this by setting the number of reducers to 0 in your driver class:

job.setNumReduceTasks(0);
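For context, here is a minimal driver sketch showing where that call goes. The class and path names (`CleanLinesDriver`, `CleanLinesMapper`) are placeholders for your own classes, not part of the Hadoop API:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CleanLinesDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "clean-lines");
        job.setJarByClass(CleanLinesDriver.class);
        job.setMapperClass(CleanLinesMapper.class); // your existing mapper

        // Map-only job: with 0 reducers, the mapper's output is written
        // straight to the output path, with no shuffle or sort phase.
        job.setNumReduceTasks(0);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With 0 reducers, whatever the mapper emits via `context.write(...)` lands directly in `part-m-*` files under the output path, one per map task.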

A more detailed blog post on this topic can be found here.

To generate one file per input line, consider using the HDFS API directly from your mapper, namely FileSystem and FileStatus.
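A rough sketch of that idea, assuming TextInputFormat (key = byte offset, value = line). The output directory, the regex used to strip special characters, and the file-naming scheme are all assumptions for illustration:

```java
import java.io.IOException;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FilePerLineMapper
        extends Mapper<LongWritable, Text, NullWritable, NullWritable> {

    private FileSystem fs;

    @Override
    protected void setup(Context context) throws IOException {
        // Get a handle on HDFS from the job configuration.
        fs = FileSystem.get(context.getConfiguration());
    }

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException {
        // Same string operation as before: remove special characters.
        String cleaned = line.toString().replaceAll("[^A-Za-z0-9 ]", "");

        // One output file per input line. Naming by the line's byte offset
        // keeps names unique across map tasks (hypothetical scheme).
        Path out = new Path("/output/lines/line-" + offset.get() + ".txt");
        try (FSDataOutputStream stream = fs.create(out)) {
            stream.writeBytes(cleaned);
        }
        // Nothing is emitted to the framework; files are written directly.
    }
}
```

One caveat worth noting: creating a separate HDFS file per line is expensive for large inputs (many small files strain the NameNode), so this pattern is best reserved for modest line counts.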

License: CC-BY-SA with attribution
Not affiliated with StackOverflow