Question

I have a Ruby script that I want to use with Hive streaming. The script requires an external gem that is not installed on my data nodes, so it will not run.

I would prefer to add this gem temporarily, just to run this job. Is there a way to include the gem in the distributed cache, perhaps as a zip (e.g. ADD FILE custom_gem.zip)?

Solution

The best way I have found to do this is to manually add the files of the gem to the distributed cache.

Here is an example of using the browser Ruby gem:

I download and unzip browser-master.zip from GitHub. Then I add the entire unzipped folder to the distributed cache:

ADD FILE /home/user/browser-master

In the Ruby script that I am using in Hive, I have to tell Ruby where to find the needed files from the gem:

# Put the gem's lib directory (shipped next to this script) on Ruby's load path.
$LOAD_PATH.push File.expand_path("../browser-master/lib", __FILE__)
require "browser"
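
The load-path step above can be wrapped in a small helper, together with a loop over the tab-separated rows that Hive streaming sends on stdin. This is only a minimal sketch: the names add_vendored_gem and each_row are hypothetical, and it assumes the unzipped gem folder ends up next to the script on the task node, as in the example above.

```ruby
# Hypothetical helper (not part of Hive or the browser gem): put the lib
# directory of a gem folder shipped with ADD FILE on Ruby's load path.
# Assumption: the gem folder sits in base_dir, next to this script.
def add_vendored_gem(gem_dir, base_dir = __dir__)
  lib_path = File.expand_path(File.join(gem_dir, "lib"), base_dir)
  # Prepend so the shipped copy is found first; skip if already present.
  $LOAD_PATH.unshift(lib_path) unless $LOAD_PATH.include?(lib_path)
  lib_path
end

# Hypothetical helper: yield each input row as an array of columns.
# Hive streaming writes one row per line with tab-separated columns.
def each_row(input = $stdin)
  input.each_line do |line|
    # The -1 limit keeps trailing empty columns instead of dropping them.
    yield line.chomp.split("\t", -1)
  end
end
```

In the streaming script, one would call add_vendored_gem("browser-master") before require "browser", then process the input with each_row { |columns| puts columns.join("\t") }, replacing the block body with the real transformation.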
Licensed under: CC-BY-SA with attribution