Question

I'm a bit new to R and I would like to use a package that allows multi-core processing in order to run the glm function faster. Is there a syntax I can use for this? Here is an example glm model that I wrote; can I add a parameter that will use multiple cores?

g<-glm(IsChurn~.,data=dat,family='binomial')

Thanks.


Solution 2

I used speedglm and the results are very good: with glm it took me 14.5 seconds to get results, and with speedglm it took 1.5 seconds. That's a 90% improvement. The code is very simple:

m <- speedglm(y ~ s1 + s2, data = df)

Just don't forget to install and load the package. One caveat: you can't select all variables with "."; speedglm does not recognize the dot as "all variables".
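Since speedglm does not expand the "." shorthand, one workaround is to build the full formula programmatically. A sketch, assuming `dat` and `IsChurn` from the question:

```r
library(speedglm)

# Build "IsChurn ~ x1 + x2 + ..." from all columns except the response,
# since speedglm does not recognize the "." shorthand itself.
predictors <- setdiff(names(dat), "IsChurn")
f <- as.formula(paste("IsChurn ~", paste(predictors, collapse = " + ")))

m <- speedglm(f, data = dat, family = binomial())
summary(m)
```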

OTHER TIPS

Other useful packages are gputools (http://cran.r-project.org/web/packages/gputools/gputools.pdf), which provides gpuGlm, and mgcv (http://cran.r-project.org/web/packages/mgcv/mgcv.pdf); see the mgcv.parallel section about gam(..., control=list(nthreads=nc)) or bam(..., cluster=makeCluster(nc)), where nc is the number of your physical cores.

A new option is my package parglm. You can find a comparison of computation times here. Below is a plot from the vignette of computation time versus number of cores used, on an 18-core machine, for two of the implemented methods.

(Plot: computation time versus number of cores for parglm, glm, and speedglm.)

The dashed line is the computation time for glm and the dotted line is the computation time for speedglm. The method shown with open circles computes the Fisher information and then solves the normal equations, as speedglm does. The method shown with filled circles makes a QR decomposition, as glm does. The former is faster but less numerically stable.
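A minimal parglm call might look like the following. This is a sketch; the exact control arguments (`nthreads`, `method`) are assumptions based on the package's interface, so check `?parglm.control` before relying on them:

```r
library(parglm)

# Fit the same model as glm, but across several threads. "FAST" solves
# the normal equations (quicker, less stable); "LINPACK" uses a QR
# decomposition as glm does.
m <- parglm(IsChurn ~ ., data = dat, family = binomial(),
            control = parglm.control(nthreads = 4L, method = "LINPACK"))
summary(m)
```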

I have added some more comments on the QR method in my answer here to a related question.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow