Question

I am using a map-side join in order to read 2 different files in my MR job.

Input File 1:
0,5.3
26,4.9
54,4.9
.
.
.

InputFile 2:
0   Anju,,3.6,IT,A,1.6,0.3
26  Remya,,3.3,EEE,B,1.6,0.3
54  akhila,,3.3,IT,C,1.3,0.3

My intention is to replace the values as below:

Anju,5.3,3.6,IT,A,1.6,0.3
Remya,4.9,3.3,EEE,B,1.6,0.3
akhila,4.9,3.3,IT,C,1.3,0.3

What I did:

In order to get 2 files as input I used 2 mappers (MultipleInputs).

The first column of both files is the file offset.

In the 1st map I emitted the first column as the key (LongWritable offset) and the rest as the value for the 1st file. In the 2nd map I also emitted the first column as the key (LongWritable offset) and the rest as the value for the 2nd file.

So in the reducer I am able to get:

Reducer
key 0
value 5.3
value Anju,S,3.6,IT,A,1.6,0.3

Reducer
key 26
value Remya,,3.3,EEE,B,1.6,0.3
value 4.9

Reducer
key 54
value 4.9
value akhila,,3.3,IT,C,1.3,0.3

How can I replace the value? Any ideas?

Is my approach right, or should I follow an alternative way? Please suggest.


Solution

You can do the replacement in the reducer with this code:

String result = null;
String replacement = null;
for (Text value: values) {
    String valueStr = value.toString();
    if (valueStr.contains(",,")) {
        result = valueStr;
    } else {
        replacement = valueStr;
    }
}
if (result == null || replacement == null) {
    return;
}
result = result.replaceFirst(",,", "," + replacement + ",");
// write result
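
The same logic can be tried outside Hadoop. The following is only a minimal sketch in plain Java: the class name `ReduceSideMerge` and the helper `merge` are hypothetical, and `String` values stand in for the `Text` values a reducer would receive. It uses the sample values for keys 26 and 54 from the question to show that the result does not depend on the order in which the two values arrive.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone helper; not part of the Hadoop API.
class ReduceSideMerge {

    // Returns the merged record, or null if one side of the join is missing.
    static String merge(Iterable<String> values) {
        String record = null;
        String replacement = null;
        for (String v : values) {
            if (v.contains(",,")) {
                record = v;        // line from InputFile 2 (has the empty field)
            } else {
                replacement = v;   // value from InputFile 1
            }
        }
        if (record == null || replacement == null) {
            return null;
        }
        return record.replaceFirst(",,", "," + replacement + ",");
    }

    public static void main(String[] args) {
        // Values arrive in either order, as in the reducer output shown in the question.
        List<String> key26 = Arrays.asList("Remya,,3.3,EEE,B,1.6,0.3", "4.9");
        List<String> key54 = Arrays.asList("4.9", "akhila,,3.3,IT,C,1.3,0.3");
        System.out.println(merge(key26)); // Remya,4.9,3.3,EEE,B,1.6,0.3
        System.out.println(merge(key54)); // akhila,4.9,3.3,IT,C,1.3,0.3
    }
}
```

Note that this only works while the record from InputFile 2 really contains an empty field (`,,`); a record without one would be mistaken for a replacement value.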

But this is a reduce-side join, not a map-side join. To do a map-side join you should read the file with replacements (InputFile 1) in every mapper (in the setup phase) and then join this data with InputFile 2 in the map phase. Example:

private Map<Integer, Double> replacements;

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    replacements = new HashMap<Integer, Double>();
    // read InputFile 1 into replacements
    // ...
}

@Override
protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    String[] parts = value.toString().split("\t"); // 0   Anju,,3.6,IT,A,1.6,0.3
    Integer id = Integer.parseInt(parts[0]);
    if (!replacements.containsKey(id)) {
        return; // can't replace
    }
    Double replacement = replacements.get(id);
    String result = parts[1].replaceFirst(",,", "," + replacement + ",");
    // write result to context
}
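
The two steps of the map-side join (load InputFile 1 into memory, then look up each line of InputFile 2) can be sketched without Hadoop as follows. This is an illustration under assumptions, not the code for the elided part of `setup` above: the class `MapSideJoinSketch` and the helpers `loadReplacements` and `join` are hypothetical names, the offsets are parsed as `long`, and the replacement is kept as a `String` so that it is copied through unchanged. In a real job the reader would wrap the side file made available to every mapper; that wiring is not shown here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone sketch; not part of the Hadoop API.
class MapSideJoinSketch {

    // Setup step: parse lines such as "26,4.9" into offset -> value.
    static Map<Long, String> loadReplacements(BufferedReader reader) throws IOException {
        Map<Long, String> replacements = new HashMap<>();
        String line;
        while ((line = reader.readLine()) != null) {
            String[] parts = line.split(",", 2);
            if (parts.length < 2) {
                continue; // skip blank or malformed lines
            }
            replacements.put(Long.parseLong(parts[0].trim()), parts[1].trim());
        }
        return replacements;
    }

    // Map step: join one line of InputFile 2, e.g. "26  Remya,,3.3,EEE,B,1.6,0.3".
    // Returns null when the line cannot be joined.
    static String join(String line, Map<Long, String> replacements) {
        String[] parts = line.trim().split("\\s+", 2); // offset, then the record
        if (parts.length < 2) {
            return null;
        }
        String replacement = replacements.get(Long.parseLong(parts[0]));
        if (replacement == null) {
            return null; // can't replace
        }
        return parts[1].replaceFirst(",,", "," + replacement + ",");
    }

    public static void main(String[] args) throws IOException {
        // Sample data taken from the question.
        String file1 = "0,5.3\n26,4.9\n54,4.9\n";
        Map<Long, String> replacements =
                loadReplacements(new BufferedReader(new StringReader(file1)));
        System.out.println(join("0   Anju,,3.6,IT,A,1.6,0.3", replacements));
        System.out.println(join("26  Remya,,3.3,EEE,B,1.6,0.3", replacements));
    }
}
```

Because every mapper holds the whole of InputFile 1 in memory, this approach only fits when that file is small enough for the mapper's heap.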
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow