boto provides the class boto.emr.bootstrap_action.BootstrapAction for defining a bootstrap action.
Use it as shown below. Most of the code comes from the boto example page.
import boto.emr
from boto.emr.bootstrap_action import BootstrapAction

# Point the bootstrap action at the script stored in S3.
# boto 2's constructor also expects bootstrap_action_args; this script takes none.
action = BootstrapAction(name="Bootstrap to add SimpleCV",
                         path="s3n://<my bucket uri>/bootstrap-simplecv.sh",
                         bootstrap_action_args=None)
conn = boto.emr.connect_to_region('us-west-2')
jobid = conn.run_jobflow(name='My jobflow',
                         log_uri='s3://<my log uri>/jobflow_logs',
                         steps=[step],  # step defined elsewhere
                         bootstrap_actions=[action])
You also need to write the bootstrap script itself and save it in S3 at the path given above. If you need another version of Python then yes, it would save time to compile it once on a machine that matches the EMR nodes, tar it, put it in an S3 bucket, and then untar it during the bootstrap.
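#!/bin/sh
# filename: bootstrap-simplecv.sh (save it in an S3 bucket)
# Exit on the first error and echo each command to the bootstrap log.
set -e -x
# -y answers apt's confirmation prompt, since bootstrap actions run unattended.
sudo apt-get install -y python-setuptools
sudo easy_install pip
sudo pip install -U SimpleCV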
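For the precompiled-Python route mentioned above, the tar and untar steps could look roughly like this. It is only a sketch using the standard library: the function names and the `python-build` directory layout are hypothetical, and copying the archive to and from S3 is not shown. On the node itself, a single `tar -xzf` line in the bootstrap script would do the same job as the unpack function.

```python
import os
import tarfile


def pack_python_build(build_dir, archive_path):
    """Bundle a compiled Python install directory into a gzipped tarball."""
    # Store paths relative to the directory's own name, not its absolute path.
    top_level = os.path.basename(os.path.normpath(build_dir))
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(build_dir, arcname=top_level)
    return archive_path


def unpack_python_build(archive_path, dest_dir):
    """Extract the tarball under dest_dir (what the bootstrap would do)."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest_dir)
```

Run the pack step once on the build machine, upload the resulting file to your bucket, and have the bootstrap download and unpack it instead of compiling.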
I think you can keep the EMR instances running from within boto (run_jobflow accepts keep_alive=True), so that the bootstrap only runs the first time in your session. Just be careful to terminate them before you log out so you don't get a surprise on your bill.
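One way to make the shutdown hard to forget is to wrap the job flow in a context manager. This is a minimal sketch, not something from the boto docs: `persistent_jobflow` is a hypothetical helper, and it assumes `conn` is a boto EMR connection exposing `run_jobflow` and `terminate_jobflow` as in the code above.

```python
from contextlib import contextmanager


@contextmanager
def persistent_jobflow(conn, **jobflow_kwargs):
    """Start a keep-alive job flow and always terminate it on exit."""
    # keep_alive=True leaves the cluster up after its steps finish,
    # so the bootstrap action runs only once.
    jobid = conn.run_jobflow(keep_alive=True, **jobflow_kwargs)
    try:
        yield jobid
    finally:
        # Runs even if the body raises, so the cluster is not left billing.
        conn.terminate_jobflow(jobid)


# Usage sketch (needs AWS credentials; action and step as defined earlier):
# with persistent_jobflow(conn, name='My jobflow',
#                         log_uri='s3://<my log uri>/jobflow_logs',
#                         bootstrap_actions=[action]) as jobid:
#     conn.add_jobflow_steps(jobid, [step])
#     ...  # wait for the steps to finish before leaving the block
```

Leaving the `with` block terminates the cluster right away, so wait for your steps to complete inside it.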