Question

Background: I have a small Python application that makes life a bit easier for the developers who release software in our company. I build an executable for Windows using py2exe. Both the application and the binary are checked into Subversion. Distribution happens by people simply checking out the directory from SVN. The program has about six Python library dependencies (e.g. ElementTree, Mako).

The situation: Developers want to hack on the source of this tool and then run it without having to build the binary. Currently this means that they need a Python 2.6 interpreter (which is fine) and also need the six libraries installed locally via easy_install.

The Problem

  • This is not a public, classical open source environment: I'm inside a corporate network, the tool will never leave the "walled garden" and we have seriously inconvenient barriers to getting to the outside internet (NTLM authenticating proxies and/or machines without direct internet access).
  • I want the hurdles to starting to hack on this tool to be minimal: nobody should have to hunt for the right dependency in the right version, and they should have to do as little setup as possible. Ideally the only prerequisites would be a Python installation and a Subversion checkout of the program.

Anecdote: The more self-contained the process is, the easier it is to repeat. I had my machine swapped out for a new one and went through the unpleasant process of reverse-engineering the dependencies, reinstalling distutils, hunting down the libraries online, and getting them to install (see the corporate internet restrictions above).


Solution

I sometimes use the approach I describe below, for the exact same reason that @Boris states: I would prefer that using a piece of code be as easy as (a) svn checkout/update, (b) go.

But for the record:

  • I use virtualenv/easy_install most of the time.
  • I agree to a certain extent with the criticisms by @Ali A and @S.Lott.

Anyway, the approach I use depends on modifying sys.path, and works like this:

  • Require python and setuptools (to enable loading code from eggs) on all computers that will use your software.
  • Organize your directory structure like this:
project/
    *.py
    scriptcustomize.py
    file.pth

    thirdparty/
        eggs/
            mako-vNNN.egg
            ... .egg
        code/
            elementtree/
                *.py
            ...
  • In your top-level script(s), include the following code at the top (a complete example script is sketched after this list):
from scriptcustomize import apply_pth_files
apply_pth_files(__file__)
  • Add scriptcustomize.py to your project folder:
import os
from glob import glob
import fileinput
import sys

def apply_pth_files(scriptfilename, at_beginning=False):
    """At the top of your script:
    from scriptcustomize import apply_pth_files
    apply_pth_files(__file__)

    """
    # *.pth files are looked up next to the top-level script
    directory = os.path.dirname(scriptfilename)
    files = glob(os.path.join(directory, '*.pth'))
    if not files:
        return
    for line in fileinput.input(files):
        line = line.strip()
        # skip blank lines and comments, as site.py does for *.pth files
        if line and line[0] != '#':
            # entries are interpreted relative to the script's directory
            path = os.path.join(directory, line)
            if at_beginning:
                sys.path.insert(0, path)
            else:
                sys.path.append(path)
  • Add one or more *.pth file(s) to your project folder. On each line, put a path, relative to the project folder, to a directory of packages or to an egg. For instance:
# contents of *.pth file
thirdparty/code
thirdparty/eggs/mako-vNNN.egg
  • I "kind-of" like this approach. What I like: it is similar to how *.pth files work, but for individual programs instead of your entire site-packages. What I do not like: having to add the two lines at the beginning of the top-level scripts.
  • Again: I use virtualenv most of the time. But I tend to use virtualenv for projects where I have tight control of the deployment scenario. In cases where I do not have tight control, I tend to use the approach I describe above. It makes it really easy to package a project as a zip and have the end user "install" it (by unzipping).
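
To tie the steps above together, here is a minimal sketch of what a top-level script in this layout could look like (the Mako and ElementTree imports merely stand in for whichever of the six libraries your tool actually uses):

# myscript.py -- lives in project/, next to scriptcustomize.py and the *.pth file
from scriptcustomize import apply_pth_files
apply_pth_files(__file__)  # must run before any third-party imports

# These imports now resolve against thirdparty/ via the *.pth entries.
from mako.template import Template
import elementtree.ElementTree as ET

print(Template("Hello, ${name}!").render(name="world"))
print(ET.XML("<greeting/>").tag)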

OTHER TIPS

Just use virtualenv - it is a tool to create isolated Python environments. You can create a set-up script and distribute the whole bunch if you want.
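
For example, a one-step bootstrap script along these lines (all names here are hypothetical; it assumes virtualenv is installed and the dependency archives are kept inside the checkout, so no internet access is needed):

# bootstrap.py -- a sketch, not a ready-made tool
import os
import subprocess

HERE = os.path.dirname(os.path.abspath(__file__))
ENV = os.path.join(HERE, 'env')
DEPS = ['Mako', 'elementtree']  # ...plus the other four dependencies

# Create an isolated environment next to the checkout.
subprocess.check_call(['virtualenv', ENV])

# Install each dependency from the local thirdparty/ directory;
# -f/--find-links keeps easy_install away from the corporate proxy.
easy_install = os.path.join(ENV, 'Scripts', 'easy_install')  # 'bin' on Unix
for dep in DEPS:
    subprocess.check_call([easy_install, '-f', os.path.join(HERE, 'thirdparty'), dep])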

"I dislike the fact that developers (or me starting on a clean new machine) have to jump through the distutils hoops of having to install the libraries locally before they can get started"

Why?

What -- specifically -- is wrong with this?

You did it to create the project. Your project is so popular others want to do the same.

I don't see a problem. Please update your question with specific problems you need solved. Disliking the way open source is distributed isn't a problem -- it's the way that open source works.

Edit. The "walled garden" doesn't matter very much.

Choice 1. You could, BTW, build an "installer" that runs easy_install 6 times for them.

Choice 2. You can save all of the installer kits that easy_install would have used. Then you can provide a script that does an unzip and a python setup.py install for all six.
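
A sketch of what that script could look like, assuming the six sdist archives are saved in a kits/ directory (the names and layout are my own illustration):

# install_deps.py -- unzip each saved kit and run its setup.py install
import glob
import os
import subprocess
import sys
import tarfile

for kit in glob.glob(os.path.join('kits', '*.tar.gz')):
    tarfile.open(kit).extractall('build')
    # an sdist named foo-1.0.tar.gz unpacks to build/foo-1.0/
    srcdir = os.path.join('build', os.path.basename(kit)[:-len('.tar.gz')])
    subprocess.check_call([sys.executable, 'setup.py', 'install'], cwd=srcdir)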

Choice 3. You can provide a zipped version of your site-packages. After they install Python, they unzip your site-packages directory into `C:\Python25\Lib\site-packages`.

Choice 4. You can build your own MSI installer kit for your Python environment.

Choice 5. You can host your own PyPI-like server and configure easy_install to check your server first.
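
For Choice 5, once such a server exists, pointing easy_install at it is a one-time, per-machine configuration, for example in ~/.pydistutils.cfg (the URL is a placeholder):

[easy_install]
index_url = http://pypi.internal.example/simple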

I agree with the answers by Nosklo and S.Lott. (+1 to both)

Can I just add that what you want to do is actually a terrible idea?

If you genuinely want people to hack on your code, they will need some understanding of the libraries involved: how they work, what they are, where they come from, the documentation for each, etc. Sure, provide them with a bootstrap script, but beyond that you will be mollycoddling them to the point that they are clueless.

Then there are specific issues such as "what if one user wants to install a different version or implementation of a library?" A glaring example here is ElementTree, which has a number of implementations.

I'm not suggesting that this is a great idea, but usually what I do in situations like these is keep a Makefile, checked into Subversion, with rules to fetch all the dependent libraries and install them. The Makefile can be smart enough to skip any dependent libraries that are already present, so this can be relatively fast.

A new developer on the project simply checks out from subversion and then types "make".
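
A stripped-down sketch of such a Makefile (archive names and versions are placeholders, and the real file would cover all six libraries; note that recipe lines must be indented with tabs):

# Makefile (sketch): install a library only if it is not already importable
deps: mako elementtree   # ...plus targets for the other four

mako:
	python -c "import mako" || easy_install kits/Mako-0.2.4.tar.gz

elementtree:
	python -c "import elementtree" || easy_install kits/elementtree-1.2.7.tar.gz

.PHONY: deps mako elementtree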

This approach might work well for you, given that your audience is already used to the idea of using Subversion checkouts as part of their fetch process. It also has the nice property that all knowledge about your program, including its external dependencies, is captured in the source code repository.

Licensed under: CC-BY-SA with attribution