Question

I am looking for a multi-node, multi-core example of training a model with caret. I currently use the multi-core functionality and it works quite well, but for tasks that require a lot of iterations I was wondering whether it is possible to use a multi-node setup, so that training runs in parallel across all the cores of all the nodes. For example, with 2 nodes of 24 cores each, instead of training on a single node with 24 cores, I would train on 48 cores using both nodes.

Is there any existing functionality that achieves this, or does it have to be coded manually?

Thanks,

  • Raj.

Solution

To let caret use multiple nodes and multiple cores, you need to create and register an appropriate foreach parallel backend. You can do that with the doSNOW package by creating a snow cluster that starts multiple workers per node, which is done by listing the same hostname multiple times. To start 24 workers on each of two nodes, you could use:

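library(doSNOW)
# One hostname entry per worker: 24 workers on node1 and 24 on node2
cl <- makeSOCKcluster(c(rep('node1', 24), rep('node2', 24)))
# Register the cluster as the foreach backend
registerDoSNOW(cl)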

The makeSOCKcluster function uses ssh to start the workers on remote machines, so you should set up password-less ssh to each node. That can be difficult (impossible?) to do on Windows, but it's commonly done on Linux and Mac OS X. If you're using a Linux cluster, it may be better to use makeMPIcluster, which requires the Rmpi package.
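Once a backend is registered this way, caret should pick it up through foreach without further changes to the train call. Here is a minimal sketch of the full round trip. It assumes two workers on 'localhost' so that it runs on a single machine, and the iris data and the "knn" method are placeholders chosen only for illustration. On a real cluster the host vector would be the node1/node2 vector from the answer.

```r
library(caret)
library(doSNOW)

# Assumption: two local workers so the sketch runs anywhere; on a cluster use
# c(rep('node1', 24), rep('node2', 24)) instead.
hosts <- rep('localhost', 2)
cl <- makeSOCKcluster(hosts)
registerDoSNOW(cl)

# Number of workers foreach will dispatch resampling iterations to
n_workers <- getDoParWorkers()

# allowParallel = TRUE is caret's default; it is spelled out here for clarity
ctrl <- trainControl(method = "cv", number = 5, allowParallel = TRUE)

set.seed(1)
# Placeholder data and model, used only to exercise the backend
fit <- train(Species ~ ., data = iris, method = "knn", trControl = ctrl)

# Shut the workers down when training is finished
stopCluster(cl)
```

Calling getDoParWorkers() before training is a cheap way to confirm that the expected number of workers (48 in the two-node case) is actually registered.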

Licensed under: CC-BY-SA with attribution