Question

The other day I needed to archive a lot of data on our network, and I was frustrated that I had no immediate way to harness the power of multiple machines to speed up the process.

I understand that creating a distributed job management system is a leap from a command-line archiving tool.

I'm now wondering what the simplest solution to this kind of distributed-performance scenario might be. Would a custom tool always be a requirement, or are there ways to use standard utilities and somehow distribute their load transparently at a higher level?

Thanks for any suggestions.


Solution

One way to tackle this might be to use a distributed make system to run scripts across networked hardware. Running jobs remotely is (or used to be) an experimental feature of some implementations of GNU Make, and Solaris provides a dmake utility for the same purpose.

Another, more heavyweight, approach might be to use Condor to distribute your archiving jobs. But I doubt you'd install Condor just for twice-yearly archiving runs; it's more of a system for regularly scavenging spare cycles from networked hardware.

The SCons build system, which is really a Python-based replacement for make, could probably be persuaded to hand work off across the network.
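
Since SConstruct files are just Python, a rough, untested sketch of that idea might look like the following; the host names, data paths, and password-less ssh setup are all assumptions, and `scons -j` supplies the parallelism:

```python
# SConstruct -- a minimal sketch, not a tested recipe.
# Assumes key-based ssh to each host; hosts and paths are placeholders.
env = Environment()

# Map each remote host to the directory it should archive.
jobs = {
    'pc1': '/srv/data/2023-q1',
    'pc2': '/srv/data/2023-q2',
}

for host, path in jobs.items():
    # The remote tar streams to stdout and the shell redirect captures it
    # locally as the target; `scons -j 2` runs both remote jobs in parallel.
    env.Command(
        target='%s.tar.gz' % path.replace('/', '_').strip('_'),
        source=[],
        action='ssh %s "tar czf - %s" > $TARGET' % (host, path),
    )
```

A nice side effect is that scons tracks what has already been built, so a re-run should only start the jobs whose archives are missing.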

Then again, you could simply script ssh to start jobs on the networked PCs.
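
For instance, a small Python wrapper along these lines (host names and paths are made up, and key-based ssh access is assumed) can fan the tar commands out and wait for them all:

```python
#!/usr/bin/env python3
"""Minimal sketch: fan archiving jobs out to networked PCs over ssh.

Host names and paths are placeholders; only standard utilities
(ssh, tar) need to exist on the remote machines.
"""
import subprocess
from concurrent.futures import ThreadPoolExecutor

# (host, directory to archive, archive to create on that host)
JOBS = [
    ("pc1.example.com", "/srv/data/2023-q1", "/tmp/2023-q1.tar.gz"),
    ("pc2.example.com", "/srv/data/2023-q2", "/tmp/2023-q2.tar.gz"),
]

def archive(job):
    host, src, dst = job
    # Each job is just ssh running plain tar on the remote machine.
    result = subprocess.run(
        ["ssh", host, "tar czf %s %s" % (dst, src)],
        capture_output=True, text=True,
    )
    return host, result.returncode, result.stderr.strip()

if __name__ == "__main__":
    # One thread per host; the real work happens on the remote CPUs.
    with ThreadPoolExecutor(max_workers=len(JOBS)) as pool:
        for host, rc, err in pool.map(archive, JOBS):
            print("%s: %s" % (host, "ok" if rc == 0 else "failed: " + err))
```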

So there are a few ways you could approach this without having to take up parallel programming, with all the fun that entails.
