Question

Does anybody know how to control/schedule Hadoop jobs using BMC Control-M software? Is it even possible?

I have tried Oozie and want to explore more options for scheduling Hadoop jobs.

Please enlighten!

Solution

The answer is YES.

And this answer is going to get even better.

Today, you can use the command-line interfaces that ship with the various Hadoop components. You can run these CLI commands individually, embed them as scripts directly in Control-M jobs, or wrap them in shell scripts (Bash is a popular choice) and schedule those with Control-M. The sample script at the end of this answer performs some HDFS manipulation and then runs a MapReduce job.

The better part is coming in a few months, when we will be releasing our integrated support for Hadoop. At that point (I am assuming you are familiar with BMC Control-M) we will provide graphical forms, similar to those of our other Control Modules (CMs), for defining various job types. Pig, Hive, and MapReduce are all being considered, but I'm not sure what will actually get implemented. We will also provide integrated support for status monitoring, retrieval of job output, etc.

We have already heard from a number of customers who are using Control-M to manage their Hadoop environments.

In addition to the "mechanics" of running Hadoop jobs, you also get Control-M's capabilities for managing graphical flows, integration with a broad range of platforms and applications, the ability to manage service levels, forecasting, auditing, reporting, and much more.

I would be happy to discuss this further with you. Since we are still in the early stages of this work, we would especially love to learn what your requirements are in this area. Please send me a note at joe_goldberg@bmc.com and I will set up a conference call or demo.

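#!/bin/csh
# Sample job script: clean up old output in HDFS, then run the MapReduce grep example.
# $UUID is expected to be set in the environment before this script runs.
cd /h/gron/java/hadoop/hadoop-1.0.3
# Remove the output directory left by an earlier run (the job fails if it already exists).
bin/hadoop dfs -rmr output_$UUID
# Run the bundled grep example over "input", writing regex matches to output_$UUID.
bin/hadoop jar hadoop-examples-1.0.3.jar grep input output_$UUID 'dfs[a-z.]+'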
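
If you go the Bash route mentioned above, a wrapper along these lines may be a useful starting point. It is only a sketch, assuming the scheduler decides success or failure from the script's exit code. The names `run_step`, `main`, and the `HADOOP` override variable are made up for illustration and are not part of Control-M or Hadoop. The Hadoop commands are the same two as in the sample script.

```shell
#!/bin/bash
# Hypothetical wrapper sketch. run_step, main, and the HADOOP variable are
# illustrative names, not Control-M or Hadoop features.

# Run one command, echo it to the job output, and return its exit status.
run_step() {
    echo "step: $*"
    "$@"
    local rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "step failed (exit $rc): $*" >&2
    fi
    return "$rc"
}

# Run the two steps from the sample script.
main() {
    # Assumption: HADOOP may point at the hadoop launcher; defaults to bin/hadoop.
    local hadoop="${HADOOP:-bin/hadoop}"
    # UUID is expected to come from the environment, as in the sample script.
    local out="output_${UUID}"

    # The cleanup is allowed to fail, e.g. when no earlier output exists.
    run_step "$hadoop" dfs -rmr "$out" || true

    # The MapReduce job itself must succeed, otherwise report failure.
    run_step "$hadoop" jar hadoop-examples-1.0.3.jar grep input "$out" 'dfs[a-z.]+' || return 1
}
```

To use it as a job script, change into the Hadoop directory first and end the file with `main; exit $?`, so that a failed MapReduce step surfaces as a non-zero exit code while a failed cleanup does not.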
Licensed under: CC-BY-SA with attribution