Question

I have the following Mapper:

private Text sentiment = new Text();

public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter)
        throws IOException {

    String allPages = value.toString();
    String[] tokens = allPages.split(":::");

    for(int i=0;i<(tokens.length-1);i++)
    {
        String articleID="";
        sentiment.set(tokens[i].trim());
        articleID = tokens[0].trim();
        System.out.println("articleID "+articleID);
        Text articleIDValue = new Text(articleID); 
        output.collect(sentiment,articleIDValue);
    }
    String line = "";
    for(int j=1;j<tokens.length;j++){
        line = line + " "+tokens[j];
        System.out.println("line.... "+line);
    }
    Text lineText = new Text(line.trim());
    output.collect(new Text(tokens[0]),lineText);

}

Sample input: abc ::: In a market that's awash in tech IPOs, this one is different.

This should be stored as the key/value pair (abc, In a market that's awash in tech IPOs, this one is different.)

Right now it is stored as (abc, abc). Where am I going wrong?


Solution

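I suspect you're seeing the result of the first collect() call. On the first loop iteration (i = 0), both the key and the value are set from tokens[0] ("abc"), so the pair (abc, abc) is emitted before the second collect() call writes the pair you expect.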
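
To make this concrete, here is a minimal plain-Java sketch (no Hadoop dependency) that replays the two collect() calls on the sample input. The class name MapperPairsSketch and the methods originalPairs and intendedPair are hypothetical names made up for illustration. intendedPair also assumes you want exactly one (id, text) pair per line and that everything after the first ":::" belongs to the text, which may not match your real data.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: simulates the mapper's output without Hadoop.
class MapperPairsSketch {

    // Mirrors the two collect() calls in the question's map() method.
    static List<String[]> originalPairs(String value) {
        List<String[]> out = new ArrayList<>();
        String[] tokens = value.split(":::");

        // First loop: at i = 0 the key (tokens[i]) and the value (tokens[0])
        // are the same token, so (abc, abc) is emitted.
        for (int i = 0; i < tokens.length - 1; i++) {
            out.add(new String[] { tokens[i].trim(), tokens[0].trim() });
        }

        // Second collect(): the remaining tokens joined with spaces.
        String line = "";
        for (int j = 1; j < tokens.length; j++) {
            line = line + " " + tokens[j];
        }
        // Note: the key here is not trimmed in the original code,
        // so it keeps any whitespace that preceded ":::".
        out.add(new String[] { tokens[0], line.trim() });
        return out;
    }

    // Assumed intent: one pair per input line, split on the first ":::" only.
    static String[] intendedPair(String value) {
        String[] tokens = value.split(":::", 2);
        String text = tokens.length > 1 ? tokens[1].trim() : "";
        return new String[] { tokens[0].trim(), text };
    }

    public static void main(String[] args) {
        String input = "abc ::: In a market that's awash in tech IPOs, this one is different.";
        for (String[] pair : originalPairs(input)) {
            System.out.println("original: (" + pair[0] + "," + pair[1] + ")");
        }
        String[] fixed = intendedPair(input);
        System.out.println("intended: (" + fixed[0] + "," + fixed[1] + ")");
    }
}
```

If that assumption holds, dropping the first loop from map() and trimming tokens[0] before the remaining collect() call should leave only the pair you expect.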

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow