Question

At work we use Gradle on a Scalding project, and I'm trying to come up with the simplest possible job to get the hang of the stack.

My class looks like this:

package org.playground

import com.twitter.scalding._

class readCsv(args: Args) extends Job(args) {

    val csv:Csv = Csv(args("input"), ("firstName", "lastName"))
    println(csv)
}

and lives in playground/src/org/playground/readCsv.scala. My build script looks like this:

apply plugin: 'scala'

archivesBaseName = 'playground'

mainClassName = 'org.playground.readCsv'

repositories {
  mavenLocal()
  mavenCentral()
  maven{ 
        url 'http://conjars.org/repo/'
        artifactUrls  'http://clojars.org/repo/'
        artifactUrls  'http://maven.twttr.com/'
    }
}

dependencies {

    compile 'org.scala-lang:scala-compiler:2.9.2'
    compile 'org.scala-lang:scala-library:2.9.2'
    compile 'bixo:bixo-core:0.9.1'
    compile 'org.apache.hadoop:hadoop-core:1.2.1'
    compile 'com.twitter:scalding_2.9.2:0.8.1' 
    compile 'cascading:cascading-core:2.1.6'
    compile 'cascading:cascading-hadoop:2.1.6'
    testCompile 'org.testng:testng:6.8.7'
    testCompile 'org.scala-tools.testing:specs:1.6.2.2_1.5.0'
}

test {
    useTestNG()
}

jar {
  description = "Assembles a Hadoop-ready JAR file"
  manifest {
    attributes( "Main-Class": "org.playground.readCsv" )
  }
}

This compiles and builds successfully but trying to run the jar throws this error:

$ java -jar build/libs/playground.jar 
Exception in thread "main" java.lang.NoClassDefFoundError: org/playground/readCsv
Caused by: java.lang.ClassNotFoundException: org.playground.readCsv
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)

My educated guess is that having the class extend Job fails to conform to some convention, so it doesn't look like a valid Main-Class, but I wouldn't expect it to complain about not finding the class.

Another possibility is that running it with java -jar jarname is incorrect and I just need to run it with hadoop or something along those lines.

Anyway, just to validate: what is wrong with my setup?


Solution

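The source file is in the wrong location. By default, Gradle's scala plugin only compiles sources under src/main/scala, so the file needs to go into src/main/scala/org/playground/readCsv.scala. Otherwise it never gets compiled, the jar contains no org/playground/readCsv class, and java fails with the ClassNotFoundException shown in the question.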
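
If you would rather keep the file where it is, one alternative is to point the main Scala source set at the existing directory. This is only a minimal sketch: the srcDirs value is an assumption based on the path given in the question, and it assumes everything under src/ is main code (any tests kept under src/ would be compiled as main sources too).

```groovy
// build.gradle: override the default src/main/scala source location.
// Assumption: all Scala sources live directly under src/, as in the question.
sourceSets {
    main {
        scala {
            srcDirs = ['src']
        }
    }
}
```

Either way, after rebuilding you can check the result by listing the jar with jar tf build/libs/playground.jar, which should now include org/playground/readCsv.class.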

Licensed under: CC-BY-SA with attribution