Question

I've been running parallel jobs on an SGE cluster using IPython parallel. I submit my jobs and, once they have all finished, retrieve the results from the hub database (SQLite) using each job's message ID. This worked fine until my controller crashed; after restarting the controller, I couldn't retrieve the jobs submitted to the old controller. I got this error:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/7.3/lib/python2.7/site-packages/IPython/parallel/controller/hub.py", line 1281, in get_results
    raise KeyError('No such message: '+msg_id)
KeyError: u'No such message: 7f1996c0-deb0-4d7c-8782-619c86d2d064'

The database file (tasks.db) still exists and has the same size as before the hub crashed, so I'm sure the results are in the database. Can I retrieve them using the new controller? Also, if I use the db_query command:

rc.db_query({'msg_id' : '7f1996c0-deb0-4d7c-8782-619c86d2d064'})

I get an empty result.


Solution

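By default, each time a Controller starts it creates a new table in the database, named with a UUID. The results from before the crash are therefore still in tasks.db, but in a table that the new Controller does not read, which is why get_results raises KeyError and db_query comes back empty. If you want each Hub session to keep adding to the same table, add this line to your ipcontroller-config.py: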

c.SQLiteDB.table = 'ipython-tasks' # or any other value

With that change, every subsequent Hub session will build on the same task history. IPython 2.0 will make this the default behavior.
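To check which table in tasks.db actually holds the old records, the file can be inspected directly with Python's standard sqlite3 module. The following is only a minimal sketch: the helper names (quote_ident, list_tables, tables_containing) are made up for illustration, and it assumes the Hub's tables store the message ID in a column called msg_id, which is worth verifying against your own database.

```python
import sqlite3


def quote_ident(name):
    """Quote a table name so SQLite accepts characters such as '-'."""
    return '"' + name.replace('"', '""') + '"'


def list_tables(db_path):
    """Return the names of all tables in the SQLite file, sorted by name."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [row[0] for row in rows]
    finally:
        conn.close()


def tables_containing(db_path, msg_id):
    """Return the tables that have a row whose msg_id column equals msg_id.

    Assumption: the column is named 'msg_id'. Tables without such a
    column are skipped.
    """
    conn = sqlite3.connect(db_path)
    try:
        hits = []
        for table in list_tables(db_path):
            # PRAGMA table_info yields one row per column; index 1 is the name.
            columns = [info[1] for info in conn.execute(
                "PRAGMA table_info(%s)" % quote_ident(table))]
            if "msg_id" not in columns:
                continue
            count = conn.execute(
                "SELECT COUNT(*) FROM %s WHERE msg_id = ?" % quote_ident(table),
                (msg_id,),
            ).fetchone()[0]
            if count:
                hits.append(table)
        return hits
    finally:
        conn.close()
```

For example, tables_containing('tasks.db', '7f1996c0-deb0-4d7c-8782-619c86d2d064') should name the table written by the crashed session, if the record is there. Setting c.SQLiteDB.table to that name before restarting the Controller should then, in principle, make the old results visible to rc.db_query again, though I have not verified this against every IPython version.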

Licensed under: CC-BY-SA with attribution