Monitoring Progress/Debugging Parallel R Scripts
Among the choices I have for quickly parallelizing simple code (snowfall, foreach, and so on), what are my options for showing the progress of all the slave processes? Do any of the offerings excel in this regard?
I've seen that snowfall 1.70 has sfCat(), but it doesn't seem to cat output to the master R session.
That's where it can turn into a black art... I notice that you did not list MPI or PVM -- those old workhorses of parallel computing do have monitors. You may also find solutions by going outside of R and relying on job schedulers (slurm, torque, ...).
If you can't do that (and hey, there are reasons why we like the simplicity of snow, foreach, ...) then maybe you can alter your jobs to log a 'heartbeat' or progress message every N steps. You can log to text files (if you have an NFS or SMB/CIFS share), log to a database, or heck, send a tweet with R. It will most likely be specific to your app, and yes, it will have some cost.
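To make the heartbeat idea concrete, here is a minimal sketch using the base parallel package (the snow-style API). The names slow_task, log_dir, n_steps and every are made up for illustration, and the loop body is only a stand-in for real work. It writes to tempdir(), which is local to one machine; on a real multi-node cluster you would point log_dir at the shared NFS or SMB/CIFS path instead.

```r
library(parallel)

# Hypothetical log directory. tempdir() only works when all workers run on
# this machine; use a shared path for a multi-node cluster.
log_dir <- file.path(tempdir(), "heartbeat")
dir.create(log_dir, showWarnings = FALSE)

# Hypothetical worker task: does n_steps iterations and appends one
# heartbeat line to its own log file every `every` steps.
slow_task <- function(task_id, n_steps, every, log_dir) {
  log_file <- file.path(log_dir, sprintf("task-%02d.log", task_id))
  total <- 0
  for (i in seq_len(n_steps)) {
    total <- total + i  # stand-in for the real computation
    if (i %% every == 0) {
      cat(sprintf("%s pid=%d step=%d/%d\n",
                  format(Sys.time()), Sys.getpid(), i, n_steps),
          file = log_file, append = TRUE)
    }
  }
  total
}

cl <- makeCluster(2)
results <- parLapply(cl, 1:4, slow_task,
                     n_steps = 100, every = 25, log_dir = log_dir)
stopCluster(cl)

# Back on the master, collect whatever the workers have logged.
log_lines <- unlist(lapply(list.files(log_dir, full.names = TRUE), readLines))
print(log_lines)
```

Each task gets its own file so that workers never interleave writes. While a long job is running you can watch the files from a second R session or with something like tail -f from a shell. The cost mentioned above shows up here as one file append per heartbeat, so pick `every` large enough that logging stays negligible next to the real work.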