Question

I am working on some MapReduce programs. They are usually coded and tested against Apache Hadoop on my local machine, and the packaged jar (with dependencies) is uploaded to our cluster running Cloudera CDH4 (v4.4.1). I keep a different pom.xml for each situation to build the packages.

Now I am using Apache Avro to serialize data, with the current stable version 1.7.5. In local mode, my pom.xml has the avro-mapred dependency

<dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-mapred</artifactId>
    <version>1.7.5</version>
</dependency>

and it works well on Apache hadoop.

In cluster mode, a classifier tag is appended to the pom.xml dependency, as suggested by the CDH4 docs:

<classifier>hadoop1</classifier>

But an error occurs with both the hadoop1 and hadoop2 classifiers. With hadoop1:

Error running child : java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
at org.apache.avro.mapreduce.AvroKeyOutputFormat.getRecordWriter(AvroKeyOutputFormat.java:87)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:597)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:444)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)

With hadoop2:

Error running child : java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
at org.apache.avro.mapreduce.AvroKeyRecordWriter.<init>(AvroKeyRecordWriter.java:53)
at org.apache.avro.mapreduce.AvroKeyOutputFormat$RecordWriterFactory.create(AvroKeyOutputFormat.java:78)
at org.apache.avro.mapreduce.AvroKeyOutputFormat.getRecordWriter(AvroKeyOutputFormat.java:104)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:597)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:444)
at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
at org.apache.hadoop.mapred.Child.main(Child.java:262)

I am programming against the new mapreduce interface on MR1. I also suspect a conflict with the Avro version installed on the cluster, and I will follow up with our cluster admin. Any ideas, guys?

Jamin


Solution

The problem is almost certainly that you are developing against a very different version of Hadoop than you are running against. CDH 4.4 comes in an "MR1" and "MR2" flavor, with newer "MR2" being the default. I think you are probably compiling against a Hadoop 1.x distribution? You don't need to compile against CDH libraries (although that's the best idea here), but if I'm right here, you will at least need to compile against Hadoop 2.x.

Your Avro dependency is fine, except that you then must not specify the "hadoop1" classifier either.

Or, if you really intend to use MR1, you need to make sure you have actually set up an MR1 cluster in CDH 4.4. This means installing the 'mapreduce' service instead of 'yarn', and using Maven artifacts with "...-mr1-..." in the name.
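For illustration, an MR1 build against CDH 4.4 might declare its Hadoop dependency along these lines. The exact repository URL and version strings here are assumptions based on Cloudera's usual artifact naming, so verify them against the Cloudera repository before relying on them:

```xml
<!-- Cloudera's Maven repository, which hosts the CDH artifacts -->
<repositories>
    <repository>
        <id>cloudera</id>
        <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    </repository>
</repositories>

<dependencies>
    <!-- MR1 flavor: note the "-mr1-" in the version string
         (2.0.0-mr1-cdh4.4.0 is assumed; check the repository for the exact value) -->
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.0.0-mr1-cdh4.4.0</version>
        <!-- provided: the cluster supplies Hadoop at runtime, so don't bundle it -->
        <scope>provided</scope>
    </dependency>
</dependencies>
```

For the default MR2/YARN flavor, the version would instead be the plain CDH string without "-mr1-" (e.g. 2.0.0-cdh4.4.0).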

OTHER TIPS

<dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-mapred</artifactId>
    <version>${avro.version}</version>
    <classifier>hadoop2</classifier>
</dependency>

This did the magic!! It's the hadoop1 vs. hadoop2 classifier issue.

refer - https://issues.apache.org/jira/browse/AVRO-1170

The above was for Cloudera.

And for MapR on Amazon:

<dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-mapred</artifactId>
    <version>${avro.version}</version>
</dependency>

<avro.version>1.7.6</avro.version>
<hadoop.version>1.0.3-mapr-2.1.3.1</hadoop.version>

These did it...Happy coding :)

This is a version conflict. The CDH-supported Avro version is currently 1.7.3. I had this same issue with 1.7.5, and it was solved by changing the version.
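In other words, pinning the Avro version in your pom.xml to what the cluster ships (1.7.3 for this CDH4 release, per the answer above) avoids the NoSuchMethodError from mismatched Avro classes. A minimal sketch:

```xml
<!-- Match the Avro version bundled with the cluster (1.7.3 here, per the CDH4 docs) -->
<dependency>
    <groupId>org.apache.avro</groupId>
    <artifactId>avro-mapred</artifactId>
    <version>1.7.3</version>
    <!-- add <classifier>hadoop1</classifier> or hadoop2 to match the
         cluster's MapReduce flavor, as discussed in the answers above -->
</dependency>
```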

From the CDH docs: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH4/latest/CDH4-Installation-Guide/CDH4-Installation-Guide.html#../CDH4-Installation-Guide/cdh4ig_topic_26_5.html

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow