Question

When deploying a large Java webapp (>100 MB .war) I currently use the following deployment process:

  • The application .war file is expanded locally on the development machine.
  • The expanded application is rsync:ed from the development machine to the live environment.
  • The app server in the live environment is restarted after the rsync. This step is not strictly needed, but I've found that restarting the application server on deployment avoids "java.lang.OutOfMemoryError: PermGen space" due to frequent class loading.

Good things about this approach:

  • The rsync minimizes the amount of data sent from the development machine to the live environment. Uploading the entire .war file takes over ten minutes, whereas an rsync takes a couple of seconds.

Bad things about this approach:

  • While the rsync is running the application context is restarted since the files are updated. Ideally the restart should happen after the rsync is complete, not when it is still running.
  • The app server restart causes roughly two minutes of downtime.

I'd like to find a deployment process with the following properties:

  • Minimal downtime during deployment process.
  • Minimal time spent uploading the data.
  • If the deployment process is app server specific, then the app server must be open-source.

Question:

  • Given the stated requirements, what is the optimal deployment process?

Solution

It has been noted that rsync does not work well when pushing changes to a WAR file. The reason for this is that WAR files are essentially ZIP files, and by default are created with compressed member files. Small changes to the member files (before compression) result in large scale differences in the ZIP file, rendering rsync's delta-transfer algorithm ineffective.

One possible solution is to use jar -0 ... to create the original WAR file. The -0 option tells the jar command to not compress the member files when creating the WAR file. Then, when rsync compares the old and new versions of the WAR file, the delta-transfer algorithm should be able to create small diffs. Then arrange that rsync sends the diffs (or original files) in compressed form; e.g. use rsync -z ... or a compressed data stream / transport underneath.

EDIT: Depending on how the WAR file is structured, it may also be necessary to use jar -0 ... to create component JAR files. This would apply to JAR files that are frequently subject to change (or that are simply rebuilt), rather than to stable 3rd party JAR files.

In theory, this procedure should give a significant improvement over sending regular WAR files. In practice I have not tried this, so I cannot promise that it will work.

The downside is that the deployed WAR file will be significantly bigger. This may result in longer webapp startup times, though I suspect that the effect would be marginal.
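To get a feel for why an uncompressed archive should suit rsync better, here is a rough Python sketch. It is only a sketch, not a measurement of rsync itself: `reusable_fraction` is a simplified stand-in for the delta-transfer matching, and the member name and payload are made up for the illustration. It builds the same one-member archive twice (once stored, once deflated), changes a few bytes of the member, and reports how much of the new archive could be reused from the old one.

```python
import io
import zipfile


def build_archive(payload: bytes, compression: int) -> bytes:
    """Build a one-member archive in memory with a fixed timestamp."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Hypothetical member name; the fixed date keeps the headers identical.
        info = zipfile.ZipInfo("WEB-INF/classes/app.properties",
                               date_time=(2020, 1, 1, 0, 0, 0))
        info.compress_type = compression
        zf.writestr(info, payload)
    return buf.getvalue()


def reusable_fraction(old: bytes, new: bytes, block: int = 64) -> float:
    """Fraction of `new` covered by blocks that already exist in `old`.

    Simplified stand-in for rsync's matching: `old` is cut into aligned
    blocks, and `new` is scanned at every byte offset for those blocks.
    """
    known = {old[i:i + block] for i in range(0, len(old) - block + 1, block)}
    i = matched = 0
    while i + block <= len(new):
        if new[i:i + block] in known:
            matched += block
            i += block
        else:
            i += 1
    return matched / len(new)


# Made-up payload: 2000 property lines, with one value changed in place.
old_payload = b"".join(b"key_%d=value_%d\n" % (n, n) for n in range(2000))
new_payload = old_payload.replace(b"value_0\n", b"VALUE_0\n", 1)

stored = reusable_fraction(build_archive(old_payload, zipfile.ZIP_STORED),
                           build_archive(new_payload, zipfile.ZIP_STORED))
deflated = reusable_fraction(build_archive(old_payload, zipfile.ZIP_DEFLATED),
                             build_archive(new_payload, zipfile.ZIP_DEFLATED))
print(f"stored: {stored:.1%} reusable, deflated: {deflated:.1%} reusable")
```

Note that ZIP compresses each member separately, so in a real WAR the damage is limited to the members that changed. The effect matters most for large members such as nested JAR files.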


A different approach entirely would be to look at your WAR file to see if you can identify library JARs that are likely to (almost) never change. Take these JARs out of the WAR file, and deploy them separately into the Tomcat server's common/lib directory; e.g. using rsync.

OTHER TIPS

Update:

Since this answer was first written, a better way to deploy war files to Tomcat with zero downtime has emerged. In recent versions of Tomcat you can include version numbers in your war filenames. So for example, you can deploy the files ROOT##001.war and ROOT##002.war to the same context simultaneously. Everything after the ## is interpreted as a version number by Tomcat and is not part of the context path. Tomcat will keep all versions of your app running and serve new requests and sessions to the newest version that is fully up, while gracefully completing old requests and sessions on the version they started with. Specifying version numbers can also be done via the Tomcat manager and even the Catalina Ant tasks. More info here.
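If you script this, picking the next versioned filename is easy to automate. Below is a minimal Python sketch; `next_war_name` and its `width` parameter are hypothetical names of mine, not part of Tomcat. As far as I know Tomcat orders versions by string comparison, which is why the sketch zero-pads the number so that 010 sorts after 009.

```python
import re

# Matches names such as "ROOT##002.war": context, "##", numeric version.
VERSIONED_WAR = re.compile(r"^(?P<context>.+)##(?P<version>\d+)\.war$")


def next_war_name(existing_files, context="ROOT", width=3):
    """Return the next '<context>##<version>.war' name, zero-padded."""
    versions = [
        int(m.group("version"))
        for m in map(VERSIONED_WAR.match, existing_files)
        if m and m.group("context") == context
    ]
    # Start at 1 when no versioned war exists yet for this context.
    return f"{context}##{max(versions, default=0) + 1:0{width}d}.war"


print(next_war_name(["ROOT##001.war", "ROOT##002.war", "docs.war"]))
```

You would feed it the listing of the webapps directory and copy the freshly built war there under the returned name.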

Original Answer:

Rsync tends to be ineffective on compressed files, since its delta-transfer algorithm looks for changes in files and a small change in an uncompressed file can drastically alter the resultant compressed version. For this reason, it might make good sense to rsync an uncompressed war file rather than a compressed version, if network bandwidth proves to be a bottleneck.

What's wrong with using the Tomcat manager application to do your deployments? If you don't want to upload the entire war file directly to the Tomcat manager app from a remote location, you could rsync it (uncompressed for reasons mentioned above) to a placeholder location on the production box, repackage it to a war, and then hand it to the manager locally. There exists a nice ant task that ships with Tomcat allowing you to script deployments using the Tomcat manager app.

There is an additional flaw in your approach that you haven't mentioned: While your application is partially deployed (during an rsync operation), your application could be in an inconsistent state where changed interfaces may be out of sync, new/updated dependencies may be unavailable, etc. Also, depending on how long your rsync job takes, your application may actually restart multiple times. Are you aware that you can and should turn off the listening-for-changed-files-and-restarting behavior in Tomcat? It is actually not recommended for production systems. You can always do a manual or ant scripted restart of your application using the Tomcat manager app.

Your application will be unavailable to users during a restart, of course. But if you're so concerned about availability, you surely have redundant web servers behind a load balancer. When deploying an updated war file, you could temporarily have the load balancer send all requests to other web servers until the deployment is over. Rinse and repeat for your other web servers.

In any environment where downtime is a consideration, you are surely running some sort of cluster of servers to increase reliability via redundancy. I'd take a host out of the cluster, update it, and then throw it back into the cluster. If you have an update that cannot run in a mixed environment (incompatible schema change required on the db, for example), you are going to have to take the whole site down, at least for a moment. The trick is to bring up replacement processes before dropping the originals.

Using tomcat as an example - you can use CATALINA_BASE to define a directory where all of tomcat's working directories will be found, separate from the executable code. Every time I deploy software, I deploy to a new base directory so that I can have new code resident on disk next to old code. I can then start up another instance of tomcat which points to the new base directory, get everything started up and running, then swap the old process (port number) with the new one in the load balancer.

If I am concerned about preserving session data across the switch, I can set up my system such that every host has a partner to which it replicates session data. I can drop one of those hosts, update it, bring it back up so that it picks the session data back up, and then switch the two hosts. If I've got multiple pairs in the cluster, I can drop half of all pairs, then do a mass switch, or I can do them a pair at a time, depending upon the requirements of the release, requirements of the enterprise, etc. Personally, however, I prefer to just allow end-users to suffer the very occasional loss of an active session rather than deal with trying to upgrade with sessions intact.

It's all a tradeoff between IT infrastructure, release process complexity, and developer effort. If your cluster is big enough and your desire strong enough, it is easy enough to design a system that can be swapped out with no downtime at all for most updates. Large schema changes often force actual downtime, since updated software usually cannot accommodate the old schema, and you probably cannot get away with copying the data to a new db instance, doing the schema update, and then switching the servers to the new db, since you will have missed any data written to the old after the new db was cloned from it. Of course, if you have resources, you can task developers with modifying the new app to use new table names for all tables that are updated, and you can put triggers in place on the live db which will correctly update the new tables with data as it is written to the old tables by the prior version (or maybe use views to emulate one schema from the other). Bring up your new app servers and swap them into the cluster. There are a ton of games you can play in order to minimize downtime if you have the development resources to build them.

Perhaps the most useful mechanism for reducing downtime during software upgrades is to make sure that your app can function in a read-only mode. That will deliver some necessary functionality to your users but leave you with the ability to make system-wide changes that require database modifications and such. Place your app into read-only mode, then clone the data, update schema, bring up new app servers against new db, then switch the load balancer to use the new app servers. Your only downtime is the time required to switch into read-only mode and the time required to modify the config of your load balancer (most of which can handle it without any downtime whatsoever).

My advice is to use rsync with exploded versions but deploy a war file.

  1. Create a temporary folder in the live environment to hold the exploded version of the webapp.
  2. Rsync the exploded version to that folder.
  3. After a successful rsync, create a war file in the temporary folder on the live machine.
  4. Replace the old war in the server's deploy directory with the new one from the temporary folder.

Replacing the old war with the new one is recommended in the JBoss container (which is based on Tomcat) because it is an atomic and fast operation, and it guarantees that when the deployer starts, the entire application is already in place.
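Steps 3 and 4 could be sketched roughly like this in Python. This is an illustration under assumptions, not a tested deployment tool: `repack_and_swap` and its parameter names are hypothetical, and the rename is only atomic if the staging file lives on the same filesystem as the deploy directory.

```python
import os
import zipfile


def repack_and_swap(exploded_dir: str, staging_war: str, deployed_war: str) -> None:
    """Zip `exploded_dir` into `staging_war`, then rename it over `deployed_war`."""
    with zipfile.ZipFile(staging_war, "w", zipfile.ZIP_DEFLATED) as war:
        for root, _dirs, files in os.walk(exploded_dir):
            for name in sorted(files):
                full = os.path.join(root, name)
                # Store paths relative to the webapp root (empty dirs are skipped).
                war.write(full, os.path.relpath(full, exploded_dir))
    # Assumption: both paths are on one filesystem, so this is a single rename
    # and the deployer never sees a half-written war.
    os.replace(staging_war, deployed_war)
```

The point of building the war outside the deploy directory is that the container only ever sees a complete file.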

Can't you make a local copy of the current web application on the web server, rsync to that directory and then perhaps even using symbolic links, in one "go", point Tomcat to a new deployment without much downtime?

Your approach of rsyncing the extracted war is pretty good, and so is the restart, since I believe that a production server should not have hot-deployment enabled. So, the only downside is the downtime when you need to restart the server, right?

I assume all state of your application is held in the database, so you have no problem with some users working on one app server instance while other users are on another app server instance. If so,

Run two app servers: Start up the second app server (which listens on other TCP ports) and deploy your application there. After deployment, update the Apache httpd's configuration (mod_jk or mod_proxy) to point to the second app server, then gracefully restart the Apache httpd process. This way you will have no downtime, and new users and requests are automatically redirected to the new app server.

If you can make use of the app server's clustering and session replication support, it will be even smoother for users who are currently logged in, as the second app server will resync as soon as it starts. Then, when there are no more accesses to the first server, shut it down.

This is dependent on your application architecture.

One of my applications sits behind a load-balancing proxy, where I perform a staggered deployment - effectively eradicating downtime.

If static files are a big part of your big WAR (100 MB is pretty big), then putting them outside the WAR and deploying them on a web server (e.g. Apache) in front of your application server might speed things up. On top of that, Apache usually does a better job of serving static files than a servlet engine does (even if most of them have made significant progress in that area).

So, instead of producing a big fat WAR, put it on a diet and produce:

  • a big fat ZIP with static files for Apache
  • a less fat WAR for the servlet engine.

Optionally, go further in the process of making the WAR thinner: if possible, deploy Grails and other JARs that don't change frequently (which is likely the case of most of them) at the application server level.

If you succeed in producing a lighter WAR, I wouldn't bother rsyncing directories rather than archives.

Strengths of this approach:

  1. The static files can be hot "deployed" on Apache (e.g. use a symbolic link pointing to the current directory, unzip the new files, update the symlink and voilà).
  2. The WAR will be thinner and it will take less time to deploy it.

Weakness of this approach:

  1. There is one more server (the web server), so this adds (a bit) more complexity.
  2. You'll need to change the build scripts (not a big deal IMO).
  3. You'll need to change the rsync logic.

I'm not sure if this answers your question, but I'll just share the deployment process I use or have encountered in the few projects I did.

Similar to you, I do not ever recall making a full war redeployment or update. Most of the time, my updates are restricted to a few JSP files, maybe a library, some class files. I am able to determine which artifacts are affected, and usually we package those updates in a zip file, along with an update script. I then run the update script, which does the following:

  • Backup the files that will be overwritten, maybe to a folder with today's date and time.
  • Unpackage my files
  • Stop the application server
  • Move the files over
  • Start the application server
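For what it's worth, an update script along those lines could look roughly like the Python sketch below. All names here (`apply_update`, `stop_server`, `start_server`) are hypothetical, and it simplifies the steps above slightly: instead of unpacking first and moving afterwards, it extracts straight into the webapp directory while the server is stopped. How you stop and start the server is left to the two callbacks.

```python
import shutil
import time
import zipfile
from pathlib import Path


def apply_update(update_zip, webapp_dir, backup_root, stop_server, start_server):
    """Back up files the update would overwrite, then apply it between stop/start."""
    webapp_dir = Path(webapp_dir)
    # Backup folder named after the current date and time.
    backup_dir = Path(backup_root) / time.strftime("%Y%m%d-%H%M%S")
    with zipfile.ZipFile(update_zip) as update:
        names = [n for n in update.namelist() if not n.endswith("/")]
        # Back up only the files that are about to be overwritten.
        for name in names:
            current = webapp_dir / name
            if current.is_file():
                target = backup_dir / name
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(current, target)
        # Stop the server, put the new files in place, start it again.
        stop_server()
        try:
            update.extractall(webapp_dir, members=names)
        finally:
            start_server()
    return backup_dir
```

Rolling back is then a matter of copying the contents of the returned backup folder over the webapp again.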

If downtime is a concern, and it usually is, my projects are usually HA: even when the servers do not share state, they sit behind a router that provides sticky session routing.

Another thing I am curious about is why you need rsync at all. You should be able to know what the required changes are by determining them in your staging/development environment, not by performing delta checks against live. In most cases you would have to tune your rsync to ignore files anyway, like certain property files that define the resources a production server uses (database connection, SMTP server, etc.).

I hope this is helpful.

What is your PermGen space set to? I would expect to see it grow as well, but shouldn't it go down after the old classes are collected? (Or is the ClassLoader still sitting around?)

Thinking out loud, you could rsync to a separate version- or date-named directory. If the container supports symbolic links, could you SIGSTOP the root process, switch over the context's filesystem root via symbolic link, and then SIGCONT?

As for the early context restarts: all containers have configuration options to disable auto-redeploy on class file or static resource changes. You probably can't disable auto-redeploy on web.xml changes, so that file should be the last one to update. So if you disable auto-redeploy and update web.xml last, you'll see the context restart only after the whole update.

We upload the new version of the webapp to a separate directory, then either move to swap it out with the running one, or use symlinks. For example, we have a symlink in the tomcat webapps directory named "myapp", which points to the current webapp named "myapp-1.23". We upload the new webapp to "myapp-1.24". When all is ready, stop the server, remove the symlink and make a new one pointing to the new version, then start the server again.
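The symlink switch itself can be sketched in a few lines of Python. This is a variation on what is described above rather than the exact procedure: instead of removing the link and recreating it, it creates a temporary link and renames it over the old one, so there is never a moment where "myapp" does not exist. The function name is hypothetical, and stopping and starting the server around the switch is left out.

```python
import os


def point_symlink_at(link_path: str, new_target: str) -> None:
    """Repoint `link_path` at `new_target` without a moment where it is missing."""
    # Temporary link next to the real one, so the rename stays on one filesystem.
    tmp_link = f"{link_path}.tmp-{os.getpid()}"
    os.symlink(new_target, tmp_link)
    # On POSIX, rename replaces the old symlink itself in a single step.
    os.replace(tmp_link, link_path)
```

For the example above that would be `point_symlink_at("webapps/myapp", "myapp-1.24")`, and rolling back is the same call with "myapp-1.23".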

We disable auto-reload on production servers for performance, but even so, having files within the webapp changing in a non-atomic manner can cause issues, as static files or even JSP pages could change in ways that cause broken links or worse.

In practice, the webapps are actually located on a shared storage device, so clustered, load-balanced, and failover servers all have the same code available.

The main drawback for your situation is that the upload will take longer, since your method allows rsync to only transfer modified or added files. You could copy the old webapp folder to the new one first, and rsync to that, if it makes a significant difference, and if it's really an issue.

Tomcat 7 has a nice feature called "parallel deployment" that is designed for this use case.

The gist is that you expand the .war into a directory, either directly under webapps/ or symlinked. Successive versions of the application are in directories named app##version, for example myapp##001 and myapp##002. Tomcat will handle existing sessions going to the old version, and new sessions going to the new version.

The catch is that you have to be very careful with PermGen leaks. This is especially true with Grails, which uses a lot of PermGen. VisualVM is your friend.

Just use two or more Tomcat servers with a proxy in front of them. That proxy can be Apache, nginx, or HAProxy.

Each proxy server has "in" and "out" URLs, with ports, configured.

First copy your war into Tomcat without stopping the service. Once the war is deployed, it is automatically opened by the Tomcat engine.

Note: cross-check that unpackWARs="true" and autoDeploy="true" are set on the "Host" node inside server.xml.

It looks like this:

  <Host name="localhost"  appBase="webapps"
        unpackWARs="true" autoDeploy="true"
        xmlValidation="false" xmlNamespaceAware="false">

Now check the Tomcat logs. If there are no errors, the application is up successfully.

Now hit all APIs for testing.

Now go to your proxy server.

Simply change the backend URL mapping to the new war's name. Since registering with a proxy server like Apache, nginx, or HAProxy takes very little time, you will see minimal downtime.

Refer -- https://developers.google.com/speed/pagespeed/module/domains for mapping urls

You're using Resin, and Resin has built-in support for web app versioning.

http://www.caucho.com/resin-4.0/admin/deploy.xtp#VersioningandGracefulUpgrades

Update: Its watchdog process can help with PermGen space issues too.

Not a "best practice" but something I just thought of.

How about deploying the webapp through a DVCS such as git?

This way you can let git figure out which files to transfer to the server. You also have a nice way to back out of it if it turns out to be busted, just do a revert!

I wrote a bash script that takes a few parameters and rsyncs the file between servers. Speeds up rsync transfer a lot for larger archives:

https://gist.github.com/3985742

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow