Question

I find that whenever I want to run a past project, it takes a long time to find it and to get everything set up again so that it can run.

For example, I have Python projects I created on Linux that depend on software packages which are easily installed there, yet I no longer have the Linux VM I was using. And some of my other projects depend on other variables like web server configuration, PATH variables, SDKs, IDE, OS version, device, etc.

Does someone have an effective way of handling this issue? So far I have only concerned myself with keeping the source code backed up, but it is difficult to re-establish a working development environment, and it is also difficult to keep the working development environment around.

Solution

What I have done in the past is either convert the physical development machine to a VM, or if it is already a VM, retain it for future use. It's not as efficient as I'd like for disk space usage, but space is cheap. Also, this process is so much less expensive time-wise than trying to re-configure an environment in the future should the need arise.

OTHER TIPS

My current favorite methodology is to maintain a script that installs ALL needed dependencies for a project, downloads the source, and hooks everything up. Some scripts have two modes: one for production, which is usually a subset of the other mode, development.
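
A minimal sketch of such a script, assuming a Debian-based system; the package names and repository URL are placeholders:

```bash
#!/usr/bin/env bash
# Hypothetical setup script: install dependencies, fetch the source, hook everything up.
# Usage: ./setup.sh [production|development]
set -euo pipefail

MODE="${1:-development}"

# Dependencies needed in both modes (package names are placeholders).
sudo apt-get update
sudo apt-get install -y git python3 python3-pip

# Fetch the source (repository URL is a placeholder).
git clone https://example.com/myproject.git
cd myproject
pip3 install -r requirements.txt

# Development mode is a superset of production: add test and lint tooling.
if [ "$MODE" = "development" ]; then
    pip3 install -r requirements-dev.txt
fi
```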

Some environments only take about 5 minutes to install with a script. In that case I keep a local VM with a fresh install of the target OS, onto which I deploy the project script when I arrive at work in the morning, and then do all coding-related work on that VM instance. Before I leave, I push all the changes via git to either my physical machine or our central repository, and terminate the VM.

If the environment takes longer to set up (long-running installs, big files to download, anything like that), I do the above procedure once a week.

The benefit is that it is very easy to deploy to a new machine and/or production server, it is all documented in the script, and the script is verified very often.

The concept you are describing is configuration management. This is, as it sounds, a way to identify, record, version/track, and report an environment. It is often a task strongly related to version control and build management, but it is distinct enough that it often requires a separate strategy, even if it uses some of the same concepts and the same processing and storage mechanisms.

Besides helping keep a working environment under control, configuration management also helps establish a record of the different working environments in which software is used (development as mentioned, plus testing/QA, deployment to routine customers, deployment to customers that require special consideration or special configuration or build properties, and so on).

As I said, often this is a task that coincides with source version control, and often configuration management data resides next to source in both documentation and the source repository. It doesn't have to be, but often is as a matter of convenience.

Automation of some aspects of configuration management has improved greatly in recent years. Some answers and comments suggested scripts as a way to promote configuration management, and scripts are a good answer for achieving reproducible results, but hand-crafted scripts by themselves are often inconsistent and incomplete. One way this has improved is through automatic provisioning. Systems like Puppet or Chef specify software components and systems for a particular user, machine, or task profile, and provide 'recipes' that allow a hands-off approach to setting up a complete machine or environment. This basically takes the concept of a software distribution repository and extends and generalizes it, providing not only the packages of software a system needs, but also configuration profiles for each package so that it is ready to use in the way that is appropriate to your situation.
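
To give a flavor of the declarative style, here is a minimal Puppet manifest sketch; the package names and file path are hypothetical:

```puppet
# Hypothetical manifest: declare what the machine should have;
# Puppet converges the machine to this state on every run.
package { ['git', 'python3', 'python3-pip']:
  ensure => installed,
}

file { '/etc/myapp.conf':
  ensure  => file,
  content => "debug = false\n",  # placeholder configuration
}
```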

Vagrant takes this in a slightly different direction and provides a way to quickly spin up virtual machine definitions, such that a VM can have its virtual software and hardware provisioned automatically. It can prove to be a convenient way to reproduce a particular representation of a hardware environment used by users of your software.
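
A minimal Vagrantfile sketch, assuming a provisioning script like the one above is checked in next to it:

```ruby
# Hypothetical Vagrantfile: defines the VM's base image, virtual hardware,
# and provisioning in one versionable file. 'vagrant up' recreates it anywhere.
Vagrant.configure("2") do |config|
  config.vm.box = "ubuntu/focal64"          # base OS image your project targets
  config.vm.provider "virtualbox" do |vb|
    vb.memory = 2048                        # virtual hardware is part of the definition too
  end
  config.vm.provision "shell", path: "setup.sh"  # reuse the project setup script
end
```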

Each system (and its variations) takes a bit of effort to set up, but has clear value if you find that reloading and reconfiguring is a common task for you.

Docker would be a good option. You can use a Dockerfile as a manifest for the environment you want. You do not need to store any image; Docker will download the required one. It can also use your own images, so you could make your own base image and then add the components the environment requires.
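
As a minimal sketch for a Python project (the base image tag and file names are illustrative):

```dockerfile
# Hypothetical Dockerfile: the whole environment, documented and reproducible.
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer between builds.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD ["python", "main.py"]
```

Building with `docker build -t myproject .` and running with `docker run myproject` then reproduces the same environment on any machine that has Docker installed.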

Using Docker can also improve other parts of your workflow:

  • The created environment can be put in the same VCS as your project, which gives you a versioned environment (neat!)
  • The Dockerfile can be used to provision the live environment, lowering the headaches of launching your projects in production.
  • If others start working with you, all they need is the Dockerfile to reproduce that huge environment setup.

So the ideas here about using a VM are only partly right. I know that HDDs are getting bigger and bigger, but that's not a reason to use up all the space you have. Also, when a VM environment needs more disk space internally, growing it can be a bit tricky, and chances are you'll need to rebuild the VM. And even when file size is not an issue, internet speed still becomes the bottleneck when you need to send over 5 GB on a normal DSL connection.

Most systems (languages, runtimes or operating systems) have some standardized way of installing software and configurations, so try to use those. Such as:

  • Maven or Gradle for Java
  • CPAN for Perl
  • rpm for Red Hat / Fedora
  • dpkg / apt-get for Debian / Ubuntu
  • MSI packages for Windows

Then make installation instructions explaining exactly what needs to be installed / what steps are necessary:

  • Provide short instructions on what you assume to be installed (base OS, base runtime such as Java / Perl / Python...)
  • Write a short script that performs the required installations (ideally just a single invocation of a tool like Maven; see the sketch after this list)
  • Test this on a fresh install (such as on a VM)
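
For a Java project, that script might be as small as this (a hedged sketch; it assumes the base JDK is already installed per the instructions above):

```bash
#!/usr/bin/env bash
# Hypothetical one-step install: Maven resolves and downloads every dependency
# declared in pom.xml, builds the project, and installs it locally.
set -e

command -v mvn >/dev/null || { echo "Please install Maven first" >&2; exit 1; }

mvn clean install
```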

Then you should be able to recreate the environment, and others should also be able to do so (which may be important if it's not a solo project).

You might need to store the necessary installation packages somewhere, or you could just include download instructions (unless the system keeps track of those, such as apt-get or Maven). That depends on how much you trust the providers of the packages - there's probably no need to store core Debian packages, but with some small free software project, it might be a good idea.

The VM solution will also work, and is probably less work in the short run (just keep the VM). However, I feel this solution offers more flexibility, for example when changing the environment.

Licensed under: CC-BY-SA with attribution