Question

I'm trying to implement an application that programs tasks based on some user input. The users can put a number of IPs with telnet commands associated with them (one to one relationship), a frequency of execution, and 2 groups (cluster, objectClass).

The user should be able to add/remove IPs, Clusters, commands, etc, at runtime. They should also be able to interrupt the executions.

This application should be able to send the telnet commands to the IPs, wait for a response and save the response in a database based on the frequency. The problem I'm having is trying to make all of this multithreaded, because there are at least 60,000 IPs to telnet, and doing it in a single thread would take too much time. One thread should process a group of IPs in the same cluster with the same objectClass.

I've looked at Quartz to schedule the jobs. With Quartz I tried to make a dynamic job that took a list of IPs (with commands), processed them and saved the result to database. But then I ran into the problem of the different timers that users gave. The examples on the Quartz webpage are incomplete and don't go too much into detail.

Then I tried to do it the old fashioned way, using java Threads, but I need to have exception handling and parameter passing, Threads don't do that. Then I discovered the Callables and Executors but I can't schedule tasks with Callables.

So Now I'm stumped, what do I do?

Was it helpful?

Solution

OK, here are some ideas. Take with the requisite grain of salt.

First, create a list of all of the work that you need to do. I assume you have this in tables somewhere and you can make a join that looks like this:

cluster | objectClass | ip-address | command | frequency | last-run-time

this represents all of the work your system needs to do. For the sake of explanation, I'll say frequency can take the form of "1 per day", "1 per hour", "4 per hour", "every minute". This table has one row per (cluster,objectClass,ip-address,command). Assume a different table has a history of runs, with error messages and other things.

Now what you need to do is read that table, and schedule the work. For scheduling use one of these:

ScheduledExecutorService exec = Executors...

When you schedule something, you need to tell it how often to run (easy enough with the frequencies we've given), and a delay. If something is to run every minute and it last ran 4 min 30 seconds ago, the initial delay is zero. If something is to run each hour the the initial delay is (60 min - 4.5 min = 55.5 min).

ScheduledFuture<?> handle = exec.scheduleAtFixedRate(...);

More complex types of scheduling are why things like Quartz exist, but basically you just need a way to resolve, given(schedule, last-run) an elapsed time to the next execution. If you can do that, then instead of scheduleAtFixedRate(...) you can use schedule(...) and then schedule the next run of a task as that task completes.

Anyway, when you schedule something, you'll get a handle back to it

ScheduledFuture<?> handle = exec.scheduleAtFixedRate(...);

Hold this handle in something that's accessible. For the sake of argument let's say it's a map by TaskKey. TaskKey is (cluster | objectClass | ip-address | command) together as an object.

Map<TaskKey,ScheduledFuture<?>> tasks = ...;

You can use that handle to cancel and schedule new jobs.

cancelForCustomer(CustomerId id) {
  List<TaskKey> keys = db.findAllTasksOwnedByCustomer(id);
  for(TaskKey key : keys) {
    ScheduledFuture<?> f = tasks.get(key);
    if(f!=null) f.cancel();
  }
}

For parameter passing, create an object to represent your work. Create one of these with all the parameters you need.

class HostCheck implements Runnable {
  private final Address host;
  private final String command;
  private final int something;
  public HostCheck(Address host, String command; int something) {
    this.host = host; this.command = command; this.something = something;
  }
  ....
}

For exception handling, localize that all into your object

class HostCheck implements Runnable {
  ...
  public void run() {
    try {
      check();
      scheduleNextRun(); // optionally, if fixed-rate doesn't work
    } catch( Exception e ) {
      db.markFailure(task); // or however.
      // Point is tell somebody about the failure.
      // You can use this to decide to stop scheduling checks for the host
      // or whatever, but just record the info now and us it to influence
      // future behavior in, er, the future.
    }
  }
}

OK, so up to this point I think we're in pretty good shape. Lots of detail to fill in but it feels manageable. Now we get to some complexity, and that's the requirement that execution of "cluster/objectClass" pairs are serial.

There are a couple of ways to handle this.

If the number of unique pairs are low, you can just make Map<ClusterObjectClassPair,ScheduledExecutorService>, making sure to create single-threaded executor services (e.g., Executors.newSingleThreadScheduledExecutor()). So instead of a single scheduling service (exec, above), you have a bunch. Simple enough.

If you need to control the amount of work you attempt concurrently, then you can have each HealthCheck acquire a permit before execution. Have some global permit object

public static final Semaphore permits = java.util.concurrent.Semaphore(30);

And then

class HostCheck implements Runnable {
  ...
  public void run() {
    permits.acquire()
    try {
      check();
      scheduleNextRun();
    } catch( Exception e ) {
      // regular handling
    } finally {
      permits.release();
    }
  }
}

You only have one thread per ClusterObjectClassPair, which serializes that work, and then permits just limit how many ClusterObjectClassPair you can talk to at a time.

I guess this turned it a quite a long answer. Good luck.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top