Question

I'm writing a distributed research application with Akka using a simple master and multiple worker topology, with the aim of deploying to an internal cluster of nodes or within a corporate cloud. (When Akka 2.1 becomes available, I'll look into using its clustering support.)

My question: What is the simplest/best way to deploy my code (in the form of a folder containing an Akka microkernel) onto each node, start it up, let it do its thing, then tear down and repeat as necessary?

  • The microkernel directory and executable are identical for every worker node, and only a few MB. The config contains the IP of the master that the workers will connect to.
  • I intend to manually start the master.
  • Resilience is not a concern since this is not a business-critical application, but rather a private research problem.
  • No important data is stored locally to the workers.
  • After the application is complete I may want to redeploy a different application without tearing down the nodes (e.g. having refined the codebase).

Update: Found out that Condor nodes support jobs running in whole-machine mode. This should allow running the microkernel as a job; I just need to make sure the workers exit properly when done.

Update 2: Someone mentioned Zookeeper might be well suited to this. Would appreciate input from anyone with experience.

Solution

Here are some ideas. I don't have experience with Akka, but I do know about grid-computing and deployment.

  1. Use an existing grid tool, such as http://www.gridgain.com (which has a GPL version). I've also heard of people building a grid with http://www.hazelcast.com/

  2. Use a cloud-in-box, like Airframe, http://www.pistoncloud.com/press-releases/piston-cloud-launches-free-openstack-distribution/. I'm sure there must be others.

  3. Airframe comes with Cloud Foundry, I believe, but you could look at using Cloud Foundry directly: https://micro.cloudfoundry.com/ I'm not sure how that version scales up, though.

  4. Roll your own system, install into one VM image, then clone it onto your other nodes.

When it comes to rolling your own, you could do it as follows, which is very similar to something I've already developed and which works well.

  • Build your application using Maven, using whatever libraries you like.

  • Push built binaries into Sonatype Nexus.

  • Build a custom launcher which, given the Maven coordinates of a module, can run it. My launcher first checks a local Maven repo for the jars; if they don't exist, it downloads them from Nexus. Then it constructs a classpath of all the jars in the transitive dependency tree. Finally, it creates a new classloader with that classpath and launches the main class via the classloader.

  • Write a service, using Java Service Wrapper or similar, that at startup checks what version of the code to run by reading some Maven coordinates and the name of a main class from config. The config could be a file on a network drive, a pre-configured URL, or even ZooKeeper. The service then passes this to the launcher, which downloads and runs the code.

  • Install this service onto many machines, either manually or by cloning a VM.

  • profit!
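
The launcher and config steps above might look roughly like the following Java sketch. This is not my actual launcher: the class name `MavenLauncher` and the property keys `repo.url`, `main.class` and `artifacts` are made up for illustration. It relies on the standard Maven repository layout (`group/path/artifact/version/artifact-version.jar`), handles release versions only, and does not resolve transitive dependencies, so every jar has to be listed explicitly in the config.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical launcher sketch; all names here are illustrative assumptions. */
final class MavenLauncher {

    /** Relative jar path for "groupId:artifactId:version" in the standard Maven repo layout. */
    static String repoPath(String coords) {
        String[] p = coords.split(":");
        if (p.length != 3) {
            throw new IllegalArgumentException("expected groupId:artifactId:version, got " + coords);
        }
        return p[0].replace('.', '/') + "/" + p[1] + "/" + p[2] + "/" + p[1] + "-" + p[2] + ".jar";
    }

    /** Returns the jar from the local repo, downloading it from the remote repo if it is missing. */
    static Path fetch(String coords, Path localRepo, String remoteRepoUrl) throws IOException {
        Path jar = localRepo.resolve(repoPath(coords));
        if (Files.notExists(jar)) {
            Files.createDirectories(jar.getParent());
            // Download to a temp file first so a failed transfer never leaves a truncated jar.
            Path tmp = Files.createTempFile(jar.getParent(), "download", ".part");
            URL remote = URI.create(remoteRepoUrl + "/" + repoPath(coords)).toURL();
            try (InputStream in = remote.openStream()) {
                Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            }
            Files.move(tmp, jar, StandardCopyOption.REPLACE_EXISTING);
        }
        return jar;
    }

    /** Builds a class loader over the given jars and invokes mainClass.main(args) through it. */
    static void launch(List<Path> jars, String mainClass, String[] args) throws Exception {
        URL[] classpath = new URL[jars.size()];
        for (int i = 0; i < classpath.length; i++) {
            classpath[i] = jars.get(i).toUri().toURL();
        }
        URLClassLoader loader = new URLClassLoader(classpath);
        // Frameworks often look classes up via the context class loader, so set it as well.
        Thread.currentThread().setContextClassLoader(loader);
        Method main = Class.forName(mainClass, true, loader).getMethod("main", String[].class);
        main.invoke(null, (Object) args);
    }

    /** args[0] is the URL of a properties file (file:, http:, ...) describing what to run. */
    public static void main(String[] args) throws Exception {
        Properties cfg = new Properties();
        try (InputStream in = URI.create(args[0]).toURL().openStream()) {
            cfg.load(in);
        }
        Path localRepo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        List<Path> jars = new ArrayList<>();
        for (String coords : cfg.getProperty("artifacts").split(",")) {
            // repo.url is expected without a trailing slash.
            jars.add(fetch(coords.trim(), localRepo, cfg.getProperty("repo.url")));
        }
        launch(jars, cfg.getProperty("main.class"), new String[0]);
    }
}
```

With something like this, redeploying a refined codebase would only mean pushing a new version to the repository and changing the coordinates in the shared properties file; the service on each node picks it up on its next start.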

I'm not mentioning Akka specifically, as most of your problem seems to be how to get code running on multiple nodes.

The custom route has worked well for me, but then I only have a few servers using this, and all my proper grid computing is done in DataSynapse. I do wonder whether, if I were to start again, one of these open-source PaaS offerings would be a good fit.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow