Question

We're starting a new project in Python with a few proprietary algorithms and sensitive bits of logic that we'd like to keep private. We also will have a few outsiders (select members of the public) working on the code. We cannot grant the outsiders access to the small, private bits of code, but we'd like a public version to work well enough for them.

Say that our project, Foo, has a module, bar, with one function, get_sauce(). What really happens in get_sauce() is secret, but we want a public version of get_sauce() to return an acceptable, albeit incorrect, result.

We also run our own Subversion server so we have total control over who can access what.

Symlinks

My first thought was symlinking: instead of bar.py, provide bar_public.py to everybody and bar_private.py to internal developers only. Unfortunately, creating symlinks is tedious, manual work, especially when there are really going to be about two dozen of these private modules.

More importantly, it makes management of the Subversion authz file difficult, since an exception must be added on the server for each module we want to protect. Someone might forget to do this and accidentally check in secrets. Then the module is in the repo, and we have to rebuild the repository without it and hope that an outsider didn't download it in the meantime.

Multiple repositories

The next thought was to have two repositories:

private
└── trunk/
    ├── __init__.py
    └── foo/
        ├── __init__.py
        └── bar.py
public
└── trunk/
    ├── __init__.py
    └── foo/
        ├── __init__.py
        ├── bar.py
        ├── baz.py
        └── quux.py

The idea is that only internal developers will be able to check out both private/ and public/. Internal developers will set PYTHONPATH=private/trunk:public/trunk, but everyone else will just set PYTHONPATH=public/trunk. Then both insiders and outsiders can write "from foo import bar" and get the right module, right?

Let's try this:

% PYTHONPATH=private/trunk:public/trunk python
Python 2.5.1
Type "help", "copyright", "credits" or "license" for more information.
>>> import foo.bar
>>> foo.bar.get_sauce()
'a private bar'
>>> import foo.quux
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named quux

I'm not a Python expert, but it seems that Python has already made up its mind about module foo and searches relative to that:

>>> foo
<module 'foo' from '/path/to/private/trunk/foo/__init__.py'>

Not even deleting foo helps:

>>> import sys
>>> del foo
>>> del sys.modules['foo']
>>> import foo.quux
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named quux

Can you provide me with a better solution or suggestion?


Solution

In the __init__.py file of the foo package you can change __path__ to make Python look for the package's modules in other directories.

So create a directory called secret and put it in your private Subversion repository. In secret put your proprietary bar.py. In the __init__.py of the public foo package put in something like:

__path__.insert(0,'secret')

Users who have the private repository, and therefore the secret directory, will get the proprietary bar.py as foo.bar, because secret is the first directory in the package's search path. For other users, Python won't find secret, will move on to the next directory in __path__, and so will load the normal bar.py from foo.

So it will look something like this:

private
└── trunk/
    └── secret/
        └── bar.py
public
└── trunk/
    ├── __init__.py
    └── foo/
        ├── __init__.py
        ├── bar.py
        ├── baz.py
        └── quux.py
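
A minimal sketch of this approach, assuming the private and public checkouts sit side by side as in the layout above. The directory names come from the answer; the temporary directory, the file contents, the _secret variable and the fresh_import helper are made up for illustration. One deliberate change from the one-liner: a bare 'secret' entry is resolved against the current working directory, so this sketch builds the path from __file__ instead.

```python
import importlib
import os
import sys
import tempfile
import textwrap

root = tempfile.mkdtemp()  # stands in for the directory holding both checkouts


def write(relpath, source=""):
    """Create a file (and its parent directories) under the temporary root."""
    path = os.path.join(root, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(textwrap.dedent(source))


# Public checkout: the package that everybody has.
write("public/trunk/foo/__init__.py", """\
    import os
    # Assumed location: <checkouts>/private/trunk/secret, found relative to
    # this file rather than the current working directory.
    _secret = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                           "..", "..", "..", "private", "trunk", "secret")
    if os.path.isdir(_secret):
        __path__.insert(0, os.path.normpath(_secret))
""")
write("public/trunk/foo/bar.py", "def get_sauce():\n    return 'a public bar'\n")
write("public/trunk/foo/quux.py", "NAME = 'quux'\n")

sys.path.insert(0, os.path.join(root, "public", "trunk"))


def fresh_import(name):
    """Import `name` after dropping any cached foo modules."""
    for cached in [m for m in sys.modules if m == "foo" or m.startswith("foo.")]:
        del sys.modules[cached]
    importlib.invalidate_caches()
    return importlib.import_module(name)


# Outsider: no private checkout exists yet, so the public bar is used.
public_result = fresh_import("foo.bar").get_sauce()

# Insider: the private checkout provides secret/bar.py, which shadows it.
write("private/trunk/secret/bar.py",
      "def get_sauce():\n    return 'a private bar'\n")
private_result = fresh_import("foo.bar").get_sauce()

# Modules that exist only in the public package stay importable.
quux_name = fresh_import("foo.quux").NAME

print(public_result, private_result, quux_name)
```

Only the public __init__.py needs to know about the secret directory, and it never names the private modules themselves, so nothing about them leaks beyond the directory name.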

OTHER TIPS

Use some sort of plugin system and keep your private plugins to yourself, but also have publicly available plugins that get shipped with the open code.

Plugin systems abound. You can easily make dead simple ones yourself. If you want something more advanced I prefer the Zope Component Architecture, but there are also options like setuptools entry_points, etc.
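
As a rough illustration of a "dead simple" hand-rolled plugin system, here is a sketch of a priority-based registry. Everything in it (the _registry dict, register, get_plugin, and the priority scheme) is hypothetical and not taken from any library; in a real project the private registration would live in a module that only exists in the private repository.

```python
# Hypothetical registry: maps a plugin name to (priority, callable).
_registry = {}


def register(name, priority=0):
    """Decorator that registers a callable, keeping the highest priority one."""
    def decorator(func):
        current = _registry.get(name)
        if current is None or priority > current[0]:
            _registry[name] = (priority, func)
        return func
    return decorator


def get_plugin(name):
    """Return the callable currently registered under `name`."""
    return _registry[name][1]


# Shipped with the public code.
@register("get_sauce")
def public_get_sauce():
    return "a public bar"


before = get_plugin("get_sauce")()


# Would live in a module that only internal developers can check out.
@register("get_sauce", priority=10)
def private_get_sauce():
    return "a private bar"


after = get_plugin("get_sauce")()
print(before, after)
```

Callers always go through get_plugin("get_sauce"), so the public code never has to mention the private implementation.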

Which one to use in your case would be a good second question.

Here's an alternate solution I noticed when reading the docs for Flask:

flaskext/__init__.py

The only purpose of this file is to mark the package as namespace package. This is required so that multiple modules from different PyPI packages can reside in the same Python package:

__import__('pkg_resources').declare_namespace(__name__)

If you want to know exactly what is happening there, checkout the distribute or setuptools docs which explain how this works.
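
For the two-repository layout from the question, a similar effect can be sketched with the standard library's pkgutil.extend_path instead of the pkg_resources call quoted above (a swapped-in technique, not what Flask used). The assumption is that both copies of foo/__init__.py contain the same two lines; the temporary directory and file contents are made up for illustration.

```python
import importlib
import os
import sys
import tempfile

# Assumed content of foo/__init__.py in BOTH repositories.
NAMESPACE_INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)

root = tempfile.mkdtemp()  # stands in for the directory holding both checkouts


def write(relpath, source):
    """Create a file (and its parent directories) under the temporary root."""
    path = os.path.join(root, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as handle:
        handle.write(source)


write("private/trunk/foo/__init__.py", NAMESPACE_INIT)
write("private/trunk/foo/bar.py", "def get_sauce():\n    return 'a private bar'\n")
write("public/trunk/foo/__init__.py", NAMESPACE_INIT)
write("public/trunk/foo/bar.py", "def get_sauce():\n    return 'a public bar'\n")
write("public/trunk/foo/quux.py", "NAME = 'quux'\n")

# Equivalent of PYTHONPATH=private/trunk:public/trunk
sys.path[:0] = [os.path.join(root, "private", "trunk"),
                os.path.join(root, "public", "trunk")]

# Drop any previously imported foo so the new search path takes effect.
for cached in [m for m in sys.modules if m == "foo" or m.startswith("foo.")]:
    del sys.modules[cached]
importlib.invalidate_caches()

foo = importlib.import_module("foo")
sauce = importlib.import_module("foo.bar").get_sauce()
quux_name = importlib.import_module("foo.quux").NAME

# foo.__path__ now lists the private foo/ first, then the public one.
search_dirs = [os.path.basename(os.path.dirname(os.path.dirname(p)))
               for p in foo.__path__]
print(sauce, quux_name, search_dirs[:2])
```

With this in place, foo.bar resolves to the private module while foo.quux is still found in the public checkout, which is the behaviour the question was after.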

Licensed under: CC-BY-SA with attribution