Question

Is it necessary to remove exported variables after a parallel computation with snow finishes? I found that the memory use of the 'rsession' process did not change much even after clusterEvalQ was called. I suspect there is a memory problem in my sample code below:

library(snow)

# Start three socket workers
cl2 <- makeCluster(3, type = "SOCK")
data <- rep(1:10000, 10000)  # 1e8 integers, roughly 400 MB

# Copy `data` to every worker
clusterExport(cl2, "data")

# Is removal necessary?
clusterEvalQ(cl2, rm(data, pos = globalenv()))

stopCluster(cl2)



Solution

Removing exported data from the cluster workers frees memory on the workers, but it doesn't free memory on the master process, which is your local R session. Removing it can be very useful if you're going to do more work on the cluster that doesn't need that data, but there's no real point if you're just going to stop the cluster.
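To see the worker-side effect for yourself, here is a minimal sketch. It assumes the parallel package that ships with R (it provides the same function names as snow) rather than snow itself, and it uses a hypothetical, much smaller object named big so it runs quickly. It only checks that the object disappears from each worker's global environment; it does not measure memory.

```r
library(parallel)  # assumption: base R's parallel package instead of snow

cl <- makeCluster(2)
big <- rep(1:1000, 1000)  # hypothetical stand-in for the large `data` vector

# Copy `big` to every worker
clusterExport(cl, "big", envir = environment())

# Each worker should now hold its own copy
before <- unlist(clusterEvalQ(cl, exists("big", envir = globalenv(), inherits = FALSE)))

# Remove the copy on each worker and run the garbage collector there
invisible(clusterEvalQ(cl, { rm(big, envir = globalenv()); gc(); NULL }))

after <- unlist(clusterEvalQ(cl, exists("big", envir = globalenv(), inherits = FALSE)))

print(before)
print(after)

stopCluster(cl)
```

If the cluster is stopped right after the export, the rm step is redundant, because the worker processes exit and their memory goes with them.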

The memory use on the master process can increase a lot when you call clusterExport because it has to serialize all of the exported objects, but it doesn't retain any references to that memory, so it should all eventually be freed by garbage collection. You don't have to do anything, but I agree with mrip that you could call gc() if you want to free it sooner. And I don't believe there is any problem with your sample code.
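On the master side, a rough sketch of what calling gc() looks like is below. Again this assumes the parallel package from base R and a hypothetical small object named big. How much memory is actually reported as freed will depend on your session and platform, so treat the printed numbers as illustrative only.

```r
library(parallel)  # assumption: base R's parallel package instead of snow

cl <- makeCluster(2)
big <- rep(1:1000, 1000)  # hypothetical stand-in for the large `data` vector
clusterExport(cl, "big", envir = environment())

# Nothing on the master still references the serialization buffers,
# so an explicit collection can reclaim them right away.
mem_after_export <- gc()

stopCluster(cl)

# The master's own copy of `big` persists until you remove it yourself.
rm(big)
mem_after_rm <- gc()  # gc() returns a matrix summarising memory use
print(mem_after_rm)
```

Note that gc() only reclaims memory that nothing references any more; the original object in your session stays until you rm() it.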

Licensed under: CC-BY-SA with attribution