Question

I have Hadoop installed and testing fine, but I am unable to find any instructions for a n00b on how to set up cascading and cascading.jruby. Where should I place the cascading jars, and how do I configure Jading to build the Ruby assemblies correctly?

Is anyone using Jenkins to build this automatically?

Edit: more details. I'm trying to build the example word count job from https://github.com/etsy/cascading.jruby

Here is what I've done so far:

  1. Installed Hadoop and ran the tests successfully.
  2. Installed JRuby.
  3. Ran gem install cascading.jruby.
  4. Installed Jading (jade): https://github.com/etsy/jading
  5. Installed Ant.
  6. Created the word count sample, wc.rb.
  7. Ran jade to compile wc.rb to a jar:

    jade wc.rb

  8. Got the following compile error:

    Buildfile: build.xml does not exist!
    Build failed
    RuntimeError: Ant retrieve failed
      (root) at /usr/bin/hjade:89

That makes sense looking at the jade code, but it isn't covered in the example usage. What am I missing here?


Solution

Sorry for the delay; this is my first answer here.

The issue you describe, Jading not being able to locate its Ant build script when called from a symlink, is indeed an issue. I'd recommend just adding your Jading clone to your PATH rather than creating symlinks (or submit a pull request to fix the issue!).
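As a minimal sketch of the PATH approach, the snippet below appends the Jading clone to PATH if it is not already there. The location /usr/local/jading is an assumption (it is the directory mentioned elsewhere in this thread), and JADING_HOME is just an illustrative variable name, not something Jading itself reads.

```shell
# Assumption: the Jading clone lives in /usr/local/jading; adjust to your checkout.
# JADING_HOME is a hypothetical variable used only by this snippet.
JADING_HOME="${JADING_HOME:-/usr/local/jading}"

# Append the clone to PATH only if it is not already listed
case ":$PATH:" in
  *":$JADING_HOME:"*) ;;
  *) PATH="$PATH:$JADING_HOME" ;;
esac
export PATH
```

If a jade symlink still exists in a directory that comes earlier in PATH, the shell will keep finding the symlink first; `command -v jade` shows which one is picked up.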

To address some of your other concerns, I've created a Getting Started page in the Jading wiki which may be of some help. It walks you through getting up and running with local and remote cascading.jruby jobs without installing anything besides the prerequisites (Java, Ant, JRuby, and the Hadoop client plus its config). It now includes a full example word count script that should work both locally and on a Hadoop cluster, and has been tested on Etsy's own internal cluster.

Backing up further still to address your question about Jenkins: yes, at Etsy we use Jenkins to build and deploy cascading.jruby (and Scalding) to our cluster. However, that build process does not currently use Jading to produce the job jar. Our build predates Jading, and Jading was an attempt to release a cleaner version of the process we go through to build that jar. Our build could easily use Jading (and the original examples came from actual uses in our code), but we have slightly different requirements for the artifacts produced by our build.

If you have any other issues with Jading, feel free to submit issues or pull requests to the GitHub project.

Other tips

If you are using JRuby, you must be using Bundler as well. In that case you can add cascading.jruby as a dependency in your Gemfile.
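A minimal sketch of such a Gemfile, written from the shell with a heredoc. The rubygems.org source line is the usual default and no version constraint is given; both are assumptions rather than anything the project prescribes. Note that this overwrites any Gemfile already in the current directory.

```shell
# Write a minimal Gemfile into the current project directory.
# Assumption: no existing Gemfile needs to be preserved here.
cat > Gemfile <<'EOF'
source 'https://rubygems.org'

# Gem name as used with `gem install` in this thread; no version pinned
gem 'cascading.jruby'
EOF
```

With the Gemfile in place, running `jruby -S bundle install` from the same directory should install the gem under JRuby, assuming Bundler is already installed there.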

You could also try installing it from your project folder:

    gem install 'cascading.jruby'

Hope this helps.

I've got this working end to end now. I had created symlinks to the hadoop and jading binaries in /usr/local/bin, which turned out to be the cause.

The scripts need to be run from their own directories in order to find their supporting files.

For example, the following works (assuming the cascading.jruby example is in ~/dev/cascading.jruby.demo/wc.rb):

    cd /usr/local/jading
    ./jade ~/dev/cascading.jruby.demo/wc.rb

    # creates a jade.jar locally in jading folder
    cd /usr/local/hadoop
    ./bin/hadoop jar /usr/local/jading/jade.jar ~/dev/cascading.jruby.demo/wc.rb ~/dev/cascading.jruby.demo/sampledata/in.txt
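The two steps above could be wrapped in a small shell function so each tool still runs from its own directory without changing yours. This is only a sketch: run_jade_job, JADING_HOME and HADOOP_HOME are hypothetical names chosen for illustration, and the default locations are the ones used in this thread.

```shell
# Assumptions: install locations match the ones used above; override if yours differ.
JADING_HOME="${JADING_HOME:-/usr/local/jading}"
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"

# run_jade_job SCRIPT INPUT
# Hypothetical helper: builds the job jar, then submits it. Each tool runs from
# its own directory inside a subshell, so the caller's working directory is untouched.
run_jade_job() {
  script="$1"
  input="$2"
  # Build jade.jar inside the Jading directory; stop if the build fails
  (cd "$JADING_HOME" && ./jade "$script") || return 1
  # Submit the jar with the same arguments as the manual invocation
  (cd "$HADOOP_HOME" && ./bin/hadoop jar "$JADING_HOME/jade.jar" "$script" "$input")
}
```

It would be called as `run_jade_job ~/dev/cascading.jruby.demo/wc.rb ~/dev/cascading.jruby.demo/sampledata/in.txt`. Because the function changes directory internally, both arguments need to be absolute paths (an unquoted `~` is expanded by the shell before the call, so it qualifies).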
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow