Python code jailing

https://stackoverflow.com/questions/10587941

08-06-2021
|

Question

I have bunch of python-projects with untrusted WSGI-apps inside them. I need to run them simulatiously and safely. So I need restrictions for directory access, python module usage and limitations for CPU and Memory.

I consider two approaches:

Import via imp-module WSGI-object from defined file, and running it with pysandbox. Now I have SandboxError: Read only object when doing:

self.config  = SandboxConfig('stdout')
self.sandbox = Sandbox(self.config)
self.s = imp.get_suffixes()
wsgi_obj = imp.load_module("run", open(path+"/run.py", "r"), path, self.s[2]).app
…
return self.sandbox.call(wsgi_obj, environ, start_response)

Modify Python interpreter, exclude potentially risky modules, run in parallel processes, communicate via ZMQ/Unix sockets. I even don't know where to start here.

What could you recommend?

Solution

I would run your applications with gunicorn, with a separate process and configuration for each app, and with user-level permissions (each untrusted app on a different user). Each gunicorn instance would serve on localhost on a user-range port, and nginx or another webserver could connect into them to route and serve them to the web.

Heroku takes this a step further and sandboxes each gunicorn instance (or unicorn or apache or arbitrary other server) in a virtual machine. This is probably the most secure possible way to do things, and definitely the best option for reliably limiting CPU and memory usage, but you may not need to go that far depending on your requirements.

One of the advantages of this kind of approach is that each application can run on a different version of Python if appropriate; with the virtual machine sandbox they can even run on different operating systems entirely.

Edit: To limit memory usage without using a VM sandbox approach, see this question. To limit CPU usage, tweak the gunicorn settings -- spin up one gevent-style worker per core an application is allowed to use.

Edit again: One completely different approach would be to use PyPy's sandboxing mechanism which should be much more secure than CPython plus a sandboxing module. However, I would prefer the guincorn or gunicorn + virtual machine approach.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow