I have a class:

    class Class1 implements Writable {
        int intField;
        double doubleField;
        Class2 refToClass2;

        public void readFields(DataInput in) {...}
        public void write(DataOutput out) {...}
    }

    class Class2 implements Serializable, Writable {
        ....
    }

Hadoop throws this error on the reducer side when I use Class1 as an output value:

    java.lang.NullPointerException
        at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
        at org.apache.hadoop.io.SequenceFile$Writer.init(SequenceFile.java:961)
        at org.apache.hadoop.io.SequenceFile$Writer.<init>(SequenceFile.java:892)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:393)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:354)
        at org.apache.hadoop.io.SequenceFile.createWriter(SequenceFile.java:476)
        at org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat.getRecordWriter(SequenceFileOutputFormat.java:61)
        at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:569)

My intuition tells me that the problem is related to Class1 or, more likely, to Class2, which implements both Serializable and Writable.

Any ideas?

UPDATE:

I've narrowed the problem down to Class1. I changed it to implement only Writable (not Serializable as well) and removed its reference to Class2, but I still get the same error. If I use another Writable implementation as the output value instead of Class1, it works. Why?

Solution

The problem turned out to be a silly mistake on my part: I had not updated the jar. In the old jar, which was the one still in use, Class1 did not implement the Writable interface.
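
One way to catch a stale jar like this is to ask the JVM at runtime where a class was loaded from and whether it really implements the expected interface. The following is a minimal sketch using only standard reflection. `ClassOriginCheck`, `implementsType` and `origin` are hypothetical names, and `java.util.ArrayList` / `java.io.Serializable` stand in for Class1 / Writable so that the snippet runs without Hadoop on the classpath.

```java
import java.security.CodeSource;

public class ClassOriginCheck {

    // True if 'type' implements or extends 'expected', as seen by the running JVM.
    public static boolean implementsType(Class<?> type, Class<?> expected) {
        return expected.isAssignableFrom(type);
    }

    // Location (jar or directory) the class was loaded from.
    // JDK classes usually have no code source, so a placeholder is returned for them.
    public static String origin(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "bootstrap/unknown";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Stand-ins (assumption): replace with Class1.class and Writable.class in a real job.
        Class<?> type = java.util.ArrayList.class;
        Class<?> expected = java.io.Serializable.class;

        System.out.println(type.getName() + " loaded from: " + origin(type));
        System.out.println("implements " + expected.getName() + ": "
                + implementsType(type, expected));
    }
}
```

Logging the same two values for Class1 and Writable from inside the task would probably have shown either an unexpected jar location or a failed interface check, which points at the deployment rather than at the code.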

As a general observation: the error in the question is ultimately caused by Hadoop not finding a Serializer for a type you are trying to serialize, whether directly or indirectly (e.g. by using that type as an output key/value). Hadoop fails to find a Serializer for one of two reasons:

  1. Your type is not serializable (i.e. it doesn't implement Writable or Serializable).
  2. No Serializer is available to Hadoop for the kind of serialization your type implements (e.g. your type implements Writable, but Hadoop for one reason or another cannot use the org.apache.hadoop.io.serializer.WritableSerialization class).
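
The two cases can be illustrated with a simplified model of the lookup. This is only a sketch, not Hadoop's source: `SerializerLookupSketch`, `Serialization`, `MyWritable`, `GoodValue`, `StaleValue`, `find` and `writableLike` are all hypothetical names. It assumes the factory returns the first registered serialization that accepts the type and nothing otherwise, so that calling a method on the missing result is what surfaces as a NullPointerException.

```java
import java.util.ArrayList;
import java.util.List;

public class SerializerLookupSketch {

    // Hypothetical stand-in for a serialization framework: it only reports which types it handles.
    public interface Serialization {
        boolean accept(Class<?> type);
    }

    // Hypothetical marker interface standing in for Hadoop's Writable.
    public interface MyWritable {}

    public static class GoodValue implements MyWritable {}

    // Like Class1 in the stale jar: the marker interface is missing.
    public static class StaleValue {}

    // Returns the first registered serialization that accepts the type, or null if none does.
    public static Serialization find(List<Serialization> registered, Class<?> type) {
        for (Serialization s : registered) {
            if (s.accept(type)) {
                return s;
            }
        }
        return null;
    }

    // A serialization that accepts any type implementing the marker interface.
    public static Serialization writableLike() {
        return new Serialization() {
            @Override
            public boolean accept(Class<?> type) {
                return MyWritable.class.isAssignableFrom(type);
            }
        };
    }

    public static void main(String[] args) {
        List<Serialization> registered = new ArrayList<>();
        registered.add(writableLike());

        // Type implements the marker and a matching serialization is registered: found.
        System.out.println(find(registered, GoodValue.class) != null);
        // Reason 1: the type does not implement the marker, so nothing accepts it.
        System.out.println(find(registered, StaleValue.class) != null);
        // Reason 2: the type is fine, but no matching serialization is registered.
        System.out.println(find(new ArrayList<>(), GoodValue.class) != null);
    }
}
```

In both failing cases the lookup yields null, so from the caller's side the two reasons look identical. That is why the stack trace alone does not tell you which one applies.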